552 research outputs found

    Detection of structural mosaicism from targeted and whole-genome sequencing data.

    Structural mosaic abnormalities are large post-zygotic mutations present in a subset of cells and have been implicated in developmental disorders and cancer. Such mutations have been conventionally assessed in clinical diagnostics using cytogenetic or microarray testing. Modern disease studies rely heavily on exome sequencing, yet an adequate method for the detection of structural mosaicism using targeted sequencing data is lacking. Here, we present a method, called MrMosaic, to detect structural mosaic abnormalities using deviations in allele fraction and read coverage from next-generation sequencing data. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) simulations were used to calculate detection performance across a range of mosaic event sizes, types, clonalities, and sequencing depths. The tool was applied to 4911 patients with undiagnosed developmental disorders, and 11 events among nine patients were detected. For eight of these 11 events, mosaicism was observed in saliva but not blood, suggesting that assaying blood alone would miss a large fraction, possibly >50%, of mosaic diagnostic chromosomal rearrangements
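The allele-fraction signal mentioned in this abstract can be illustrated with a standard textbook model. This is a minimal sketch, not MrMosaic's actual algorithm: the function name, the event labels, and the assumption of a diploid background with a single-copy change are all hypothetical choices made for illustration. It computes how far the B allele fraction at a heterozygous site is expected to shift from 0.5 when a fraction of cells (the clonality) carries the event.

```python
def expected_baf_deviation(event: str, clonality: float) -> float:
    """Expected |BAF - 0.5| at a heterozygous site (illustrative model only).

    event:     "loss" (one-copy deletion), "gain" (one-copy duplication),
               or "loh" (copy-neutral loss of heterozygosity).
    clonality: fraction of cells carrying the event, between 0 and 1.
    """
    if not 0.0 <= clonality <= 1.0:
        raise ValueError("clonality must be between 0 and 1")
    c = clonality
    if event == "loss":
        # Retained allele: 1 copy per cell; lost allele: (1 - c); total: 2 - c.
        return c / (2 * (2 - c))
    if event == "gain":
        # Duplicated allele: (1 + c) copies; other allele: 1; total: 2 + c.
        return c / (2 * (2 + c))
    if event == "loh":
        # Retained allele: (1 + c) copies; other allele: (1 - c); total: 2.
        return c / 2
    raise ValueError(f"unknown event type: {event!r}")


for event in ("loss", "gain", "loh"):
    print(event, round(expected_baf_deviation(event, 0.2), 4))
```

Under this model, a gain at a given clonality shifts the allele fraction less than a loss or a copy-neutral event does, which is one reason detection performance would be expected to vary by event type.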

    Quantifying single nucleotide variant detection sensitivity in exome sequencing

    BACKGROUND: The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed. RESULTS: Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed. CONCLUSIONS: Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. 
Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits
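The "naive expectation of sampling from a binomial" in this abstract can be sketched as follows. This is an illustrative calculation, not the authors' calibrated empirical model: the function name and the requirement of at least two reads supporting the alternate allele are assumptions chosen for the example.

```python
from math import comb


def naive_het_sensitivity(depth: int, min_alt_reads: int = 2,
                          alt_fraction: float = 0.5) -> float:
    """Probability of seeing at least `min_alt_reads` alternate-allele reads
    among `depth` reads, if each read carries the alternate allele
    independently with probability `alt_fraction` (0.5 for a heterozygote).

    `min_alt_reads` is a hypothetical calling threshold, not from the paper.
    """
    # Sum the binomial probabilities of observing too few alternate reads.
    p_miss = sum(
        comb(depth, i) * alt_fraction**i * (1 - alt_fraction) ** (depth - i)
        for i in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_miss


for depth in (3, 8, 13, 20):
    print(depth, round(naive_het_sensitivity(depth), 4))
```

Under these assumptions the naive model predicts well above 95% sensitivity at 13X, whereas the abstract reports that 13X is what is needed to reach 95% in practice, which is consistent with its statement that measured sensitivity is substantially worse than the binomial expectation.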

    Complete mitochondrial DNA sequences provide new insights into the Polynesian motif and the peopling of Madagascar

    More than a decade of mitochondrial DNA (mtDNA) studies have given the 'Polynesian motif' renowned status as a marker for tracing the late-Holocene expansion of Austronesian speaking populations. Despite considerable research on the Polynesian motif in Oceania, there has been little equivalent work on the western edge of its expansion - leaving major issues unresolved regarding the motif's evolutionary history. This has also led to considerable uncertainty regarding the settlement of Madagascar. In this study, we assess mtDNA variation in 266 individuals from three Malagasy ethnic groups: the Mikea, Vezo, and Merina. Complete mtDNA genome sequencing reveals a new variant of the Polynesian motif in Madagascar; two coding region mutations define a Malagasy-specific sub-branch. This newly defined 'Malagasy motif' occurs at high frequency in all three ethnic groups (13-50%), and its phylogenetic position, geographic distribution, and estimated age all support a recent origin, but without conclusively identifying a specific source region. Nevertheless, the haplotype's limited diversity, similar to those of other mtDNA haplogroups found in our Malagasy groups, best supports a small number of initial settlers arriving to Madagascar through the same migratory process. Finally, the discovery of this lineage provides a set of new polymorphic positions to help localize the Austronesian ancestors of the Malagasy, as well as uncover the origin and evolution of the Polynesian motif itself

    Independent and population-specific association of risk variants at the IRGM locus with Crohn's disease

    DNA polymorphisms in a region on chromosome 5q33.1 which contains two genes, immunity related GTPase related family, M (IRGM) and zinc finger protein 300 (ZNF300), are associated with Crohn's disease (CD). The deleted allele of a 20 kb copy number variation (CNV) upstream of IRGM was recently shown to be in strong linkage disequilibrium (LD) with the CD-associated single nucleotide polymorphisms and is itself associated with CD (P < 0.01). The deletion was correlated with increased or reduced expression of IRGM in transformed cells in a cell line-dependent manner, and has been proposed as a likely causal variant. We report here that small insertion/deletion polymorphisms in the promoter and 5′ untranslated region of IRGM are, together with the CNV, strongly associated with CD (P = 1.37 × 10⁻⁵ to 1.40 × 10⁻⁹), and that the CNV and the 5′-untranslated region variant −308(GTTT)5 contribute independently to CD susceptibility (P = 2.6 × 10⁻⁷ and P = 2 × 10⁻⁵, respectively). We also show that the CD risk haplotype is associated with a significant decrease in IRGM expression (P < 10⁻¹²) in untransformed lymphocytes from CD patients. Further analysis of these variants in a Japanese CD case-control sample and of IRGM expression in HapMap populations revealed that neither the IRGM insertion/deletion polymorphisms nor the CNV was associated with CD or with altered IRGM expression in the Asian population. This suggests that the involvement of the IRGM risk haplotype in the pathogenesis of CD requires gene-gene or gene-environment interactions which are absent in Asian populations, or that none of the variants analysed are causal, and that the true causal variants arose after the European-Asian split

    Discovery of Western European R1b1a2 Y Chromosome Variants in 1000 Genomes Project Data: An Online Community Approach

    The authors have used an online community approach, and tools that were readily available via the Internet, to discover genealogically and therefore phylogenetically relevant Y-chromosome polymorphisms within core haplogroup R1b1a2-L11/S127 (rs9786076). Presented here is the analysis of 135 unrelated L11 derived samples from the 1000 Genomes Project. We were able to discover new variants and build a much more complex phylogenetic relationship for L11 sub-clades. Many of the variants were further validated using PCR amplification and Sanger sequencing. The identification of these new variants will help further the understanding of population history including patrilineal migrations in Western and Central Europe where R1b1a2 is the most frequent haplogroup. The fine-grained phylogenetic tree we present here will also help to refine historical genetic dating studies. Our findings demonstrate the power of citizen science for analysis of whole genome sequence data

    Characterising and Predicting Haploinsufficiency in the Human Genome

    Ni Huang is with the Wellcome Trust Sanger Institute, Insuk Lee is with UT Austin and Yonsei University, Edward M. Marcotte is with UT Austin, Matthew E. Hurles is with the Wellcome Trust Sanger Institute. Haploinsufficiency, wherein a single functional copy of a gene is insufficient to maintain normal function, is a major cause of dominant disease. Human disease studies have identified several hundred haploinsufficient (HI) genes. We have compiled a map of 1,079 haplosufficient (HS) genes by systematic identification of genes unambiguously and repeatedly compromised by copy number variation among 8,458 apparently healthy individuals and contrasted the genomic, evolutionary, functional, and network properties between these HS genes and known HI genes. We found that HI genes are typically longer and have more conserved coding sequences and promoters than HS genes. HI genes exhibit higher levels of expression during early development and greater tissue specificity. Moreover, within a probabilistic human functional interaction network HI genes have more interaction partners and greater network proximity to other known HI genes. We built a predictive model on the basis of these differences and annotated 12,443 genes with their predicted probability of being haploinsufficient. We validated these predictions of haploinsufficiency by demonstrating that genes with a high predicted probability of exhibiting haploinsufficiency are enriched among genes implicated in human dominant diseases and among genes causing abnormal phenotypes in heterozygous knockout mice. We have transformed these gene-based haploinsufficiency predictions into haploinsufficiency scores for genic deletions, which we demonstrate to better discriminate between pathogenic and benign deletions than consideration of the deletion size or numbers of genes deleted. 
These robust predictions of haploinsufficiency support clinical interpretation of novel loss-of-function variants and prioritization of variants and genes for follow-up studies. NH and MEH are funded by the Wellcome Trust [grant number 077014/Z/05/Z]. IL is funded by grants from the National Research Foundation of Korea (NRF) funded by the Korea government (MEST) (No. 2010-0017649, 2010-0015754, 2009-0087951) and EMM by the NIH, Welch (F-1515), and Packard Foundations, by the Texas Institute for Drug and Diagnostic Development, and by the Texas Advanced Research Program. This study makes use of data provided by the Genetic Association Information Network (GAIN) and the Wellcome Trust Case Control Consortium 2 (WTCCC2), through work funded by NIH and the Wellcome Trust. This study also makes use of data generated by the DECIPHER consortium, which is funded by the Wellcome Trust. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Cellular and Molecular Biology

    Phenotype-specific effect of chromosome 1q21.1 rearrangements and GJA5 duplications in 2436 congenital heart disease patients and 6760 controls

    Recurrent rearrangements of chromosome 1q21.1 that occur via non-allelic homologous recombination have been associated with variable phenotypes exhibiting incomplete penetrance, including congenital heart disease (CHD). However, the gene or genes within the ∼1 Mb critical region responsible for each of the associated phenotypes remains unknown. We examined the 1q21.1 locus in 948 patients with tetralogy of Fallot (TOF), 1488 patients with other forms of CHD and 6760 ethnically matched controls using single nucleotide polymorphism genotyping arrays (Illumina 660W and Affymetrix 6.0) and multiplex ligation-dependent probe amplification. We found that duplication of 1q21.1 was more common in cases of TOF than in controls [odds ratio (OR) 30.9 (95% confidence interval (CI) 8.9-107.6); P = 2.2 × 10⁻⁷], but deletion was not. In contrast, deletion of 1q21.1 was more common in cases of non-TOF CHD than in controls [OR 5.5 (95% CI 1.4-22.0); P = 0.04] while duplication was not. We also detected rare (n = 3) 100-200 kb duplications within the critical region of 1q21.1 in cases of TOF. These small duplications encompassed a single gene in common, GJA5, and were enriched in cases of TOF in comparison to controls [OR = 10.7 (95% CI 1.8-64.3), P = 0.01]. These findings show that duplication and deletion at chromosome 1q21.1 exhibit a degree of phenotypic specificity in CHD, and implicate GJA5 as the gene responsible for the CHD phenotypes observed with copy number imbalances at this locus

    Human spermatogenic failure purges deleterious mutation load from the autosomes and both sex chromosomes, including the gene DMRT1

    Gonadal failure, along with early pregnancy loss and perinatal death, may be an important filter that limits the propagation of harmful mutations in the human population. We hypothesized that men with spermatogenic impairment, a disease with unknown genetic architecture and a common cause of male infertility, are enriched for rare deleterious mutations compared to men with normal spermatogenesis. After assaying genomewide SNPs and CNVs in 323 Caucasian men with idiopathic spermatogenic impairment and more than 1,100 controls, we estimate that each rare autosomal deletion detected in our study multiplicatively changes a man’s risk of disease by 10% (OR 1.10 [1.04–1.16], p < 2 × 10⁻³), rare X-linked CNVs by 29% (OR 1.29 [1.11–1.50], p < 1 × 10⁻³), and rare Y-linked duplications by 88% (OR 1.88 [1.13–3.13], p < 0.03). By contrasting the properties of our case-specific CNVs with those of CNV callsets from cases of autism, schizophrenia, bipolar disorder, and intellectual disability, we propose that the CNV burden in spermatogenic impairment is distinct from the burden of large, dominant mutations described for neurodevelopmental disorders. We identified two patients with deletions of DMRT1, a gene on chromosome 9p24.3 orthologous to the putative sex determination locus of the avian ZW chromosome system. In an independent sample of Han Chinese men, we identified 3 more DMRT1 deletions in 979 cases of idiopathic azoospermia and none in 1,734 controls, and found none in an additional 4,519 controls from public databases. The combined results indicate that DMRT1 loss-of-function mutations are a risk factor and potential genetic cause of human spermatogenic failure (frequency of 0.38% in 1306 cases and 0% in 7,754 controls, p = 6.2 × 10⁻⁵). 
Our study identifies other recurrent CNVs as potential causes of idiopathic azoospermia and generates hypotheses for directing future studies on the genetic basis of male infertility and IVF outcomes. This work was partially funded by the Portuguese Foundation for Science and Technology FCT/MCTES (PIDDAC) and co-financed by European funds (FEDER) through the COMPETE program, research grant PTDC/SAU-GMG/101229/2008. IPATIMUP is an Associate Laboratory of the Portuguese Ministry of Science, Technology, and Higher Education and is partially supported by FCT. AML is the recipient of a postdoctoral fellowship from FCT (SFRH/BPD/73366/2010). CO is supported by a grant from the United States National Institutes of Health (R01 HD21244), JDS is supported by Damon Runyon Clinical Investigator Award, Alex's Lemonade Stand Foundation Epidemiology Award, and the Eunice Kennedy Shriver Children's Health Research Career Development Award NICHD 5K12HD001410. Support for human studies and specimens was provided by the NIH/NIDDK George M. O'Brien Center for Kidney Disease Kidney Translational Research Core (P30DK079333) grant to Washington University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript

    Looking to the future of zebrafish as a model to understand the genetic basis of eye disease

    In this brief commentary, we provide some of our thoughts and opinions on the current and future use of zebrafish to model human eye disease, dissect pathological progression and advance our understanding of the genetic bases of microphthalmia, anophthalmia and coloboma (MAC) in humans. We provide some background on eye formation in fish and conservation and divergence across vertebrates in this process, discuss different approaches for manipulating gene function and speculate on future research areas where we think research using fish may prove to be particularly effective