
    On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing

    One of the most widely used techniques to study structural variation at the genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is very appropriate to detect a wide range of inversions, even with low-coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far more than increases in coverage depth or read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precision. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects.
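
The abstract above does not spell out how PEM flags an inversion, so the following is only a minimal sketch of the general idea, assuming a standard forward/reverse paired-end library: a pair whose two reads map to the same strand is a candidate for spanning an inversion breakpoint. The function name `classify_pair` and the span thresholds `min_span` and `max_span` are hypothetical, and this is not the method used by SVDetect, GRIAL, or VariationHunter.

```python
def classify_pair(strand1, pos1, strand2, pos2, min_span=200, max_span=600):
    """Classify one read pair whose reads map to the same chromosome.

    strand1/strand2 are "+" or "-"; pos1/pos2 are leftmost mapping
    coordinates. min_span/max_span are hypothetical library-specific
    limits on the distance between the two reads (an approximation of
    the fragment length).
    """
    # Put the leftmost read first so orientation is judged consistently.
    if pos2 < pos1:
        strand1, pos1, strand2, pos2 = strand2, pos2, strand1, pos1
    span = pos2 - pos1

    # Both reads on the same strand: the signature expected when a pair
    # spans an inversion breakpoint.
    if strand1 == strand2:
        return "inversion_candidate"
    # Left read reverse, right read forward: reads point away from each other.
    if strand1 == "-":
        return "everted"
    # Forward/reverse pair: concordant only if the span fits the library.
    if min_span <= span <= max_span:
        return "concordant"
    return "size_discordant"


if __name__ == "__main__":
    print(classify_pair("+", 1000, "-", 1400))
    print(classify_pair("+", 1000, "+", 1400))
```

In a sketch like this, a longer library simply widens the window in which a single pair can bridge a breakpoint, which is consistent with the abstract's point that library length matters more than coverage or read length.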

    KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses

    High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and provides a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variation

    Rare copy number variation in cerebral palsy

    As per publisher: published online 22 May 2013.
    Recent studies have established the role of rare copy number variants (CNVs) in several neurological disorders, but the contribution of rare CNVs to cerebral palsy (CP) is not known. Fifty Caucasian families having children with CP were studied using two microarray designs. Potentially pathogenic, rare (<1% population frequency) CNVs were identified, and their frequency determined, by comparing the CNVs found in cases with those in 8329 adult controls with no known neurological disorders. Ten of the 50 cases (20%) had rare CNVs of potential relevance to CP; there were a total of 14 CNVs, which were observed in <0.1% (<8/8329) of the control population. Eight were inherited from an unaffected mother: a 751-kb deletion including FSCB, a 1.5-Mb duplication of 7q21.13, a 534-kb duplication of 15q11.2, a 446-kb duplication including CTNND2, a 219-kb duplication including MCPH1, a 169-kb duplication of 22q13.33, a 64-kb duplication of MC2R, and a 135-bp exonic deletion of SLC06A1. Three were inherited from an unaffected father: a 386-kb deletion of 12p12.2-p12.1, a 234-kb duplication of 10q26.13, and a 4-kb exonic deletion of COPS3. The inheritance was unknown for three CNVs: a 157-bp exonic deletion of ACOX1, a 693-kb duplication of 17q25.3, and a 265-kb duplication of DAAM1. This is the first systematic study of CNVs in CP and, although it did not identify de novo mutations, it has shown inherited, rare CNVs involving potentially pathogenic genes and pathways requiring further investigation.
    Gai McMichael, Santhosh Girirajan, Andres Moreno-De-Luca, Jozef Gecz, Chloe Shard, Lam Son Nguyen, Jillian Nicholl, Catherine Gibson, Eric Haan, Evan Eichler, Christa Lese Martin and Alastair MacLenna

    Correction: Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment

    Children affected by Specific Language Impairment (SLI) fail to acquire age appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10–4, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model

    An examination of the Apo-1/Fas promoter Mva I polymorphism in Japanese patients with multiple sclerosis

    BACKGROUND: The Apo-1/Fas (CD95) molecule is an apoptosis-signaling cell surface receptor belonging to the tumor necrosis factor (TNF) receptor family. Both Fas and Fas ligand (FasL) are expressed in activated mature T cells, and prolonged cell activation induces susceptibility to Fas-mediated apoptosis. The Apo-1/Fas gene is located in a chromosomal region that shows linkage in multiple sclerosis (MS) genome screens, and studies indicate that there is aberrant expression of the Apo-1/Fas molecule in MS. METHODS: Mva I polymorphism on the Apo-1/Fas promoter gene was detected by PCR-RFLP from the DNA of 114 Japanese patients with conventional MS and 121 healthy controls. We investigated the association of the Mva I polymorphism in Japanese MS patients using a case-control association study design. RESULTS: We found no evidence that the polymorphism contributes to susceptibility to MS. Furthermore, there was no association between Apo-1/Fas gene polymorphisms and clinical course (relapsing-remitting course or secondary-progressive course). No significant association was observed between Apo-1/Fas gene polymorphisms and the age at disease onset. CONCLUSIONS: Overall, our findings suggest that Apo-1/Fas promoter gene polymorphisms are not conclusively related to susceptibility to MS or the clinical characteristics of Japanese patients with MS

    Copy number variations in East-Asian population and their evolutionary and functional implications

    The recent discovery of copy number variation (CNV) in normal individuals has widened our understanding of genomic variation. However, most of the reported CNVs have been identified in Caucasians, and may not be directly applicable to people of different ethnicities. To profile CNV in an East-Asian population, we screened CNVs in 3578 healthy, unrelated Korean individuals, using the Affymetrix Genome-Wide Human SNP array 5.0. We identified 144 207 CNVs using a pooled data set of 100 randomly chosen Korean females as a reference. The average number of CNVs per genome was 40.3, which is higher than that of CNVs previously reported using lower resolution platforms. The median size of CNVs was 18.9 kb (range 0.2–5406 kb). Copy number losses were 4.7 times more frequent than copy number gains. CNV regions (CNVRs) were defined by merging overlapping CNVs identified in two or more samples. In total, 4003 CNVRs were defined, encompassing 241.9 Mb and accounting for ∼8% of the human genome. A total of 2077 CNVRs (51.9%) were potentially novel. Known CNVRs were larger and more frequent than novel CNVRs. Sixteen percent of the CNVRs were observed in ≥1% of study subjects and 24% overlapped with the OMIM genes. A total of 476 (11.9%) CNVRs were associated with segmental duplications. The CNVs/CNVRs identified in this study will be valuable resources for studying human genome diversity and its association with disease.
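
The CNVR definition in the abstract above (merging overlapping CNVs seen in two or more samples) can be illustrated with a simple interval merge. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `merge_cnvs`, the tuple layout, and the rule that any shared coordinate counts as overlap are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def merge_cnvs(cnvs, min_samples=2):
    """Merge overlapping CNV calls into regions (CNVRs).

    cnvs: iterable of (sample, chrom, start, end) tuples (assumed layout).
    Returns a list of (chrom, start, end, n_samples) for merged regions
    supported by at least `min_samples` distinct samples.
    """
    by_chrom = defaultdict(list)
    for sample, chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end, sample))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        samples = set()
        # Sweep the calls left to right, extending the current region
        # while the next call overlaps it.
        for start, end, sample in sorted(by_chrom[chrom]):
            if cur_start is not None and start <= cur_end:
                cur_end = max(cur_end, end)
                samples.add(sample)
            else:
                if cur_start is not None and len(samples) >= min_samples:
                    regions.append((chrom, cur_start, cur_end, len(samples)))
                cur_start, cur_end, samples = start, end, {sample}
        # Flush the last open region on this chromosome.
        if cur_start is not None and len(samples) >= min_samples:
            regions.append((chrom, cur_start, cur_end, len(samples)))
    return regions


if __name__ == "__main__":
    calls = [
        ("s1", "chr1", 100, 200),
        ("s2", "chr1", 150, 300),
        ("s3", "chr1", 500, 600),  # seen in one sample only, dropped
    ]
    print(merge_cnvs(calls))
```

Real pipelines often require a minimum reciprocal overlap rather than any shared coordinate; the abstract does not state which rule was used, so the simplest one is shown here.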

    Impact of whole genome amplification on analysis of copy number variants

    Large-scale copy number variants (CNVs) have recently been recognized to play a role in human genome variation and disease. Approaches for analysis of CNVs in small samples such as microdissected tissues can be confounded by limited amounts of material. To facilitate analyses of such samples, whole genome amplification (WGA) techniques were developed. In this study, we explored the impact of Phi29 multiple-strand displacement amplification on detection of CNVs using oligonucleotide arrays. We extracted DNA from fresh frozen lymph node samples and used this for amplification and analysis on the Affymetrix Mapping 500k SNP array platform. We demonstrated that the WGA procedure introduces hundreds of potentially confounding CNV artifacts that can obscure detection of bona fide variants. Our analysis indicates that many artifacts are reproducible, and may correlate with proximity to chromosome ends and GC content. Pair-wise comparison of amplified products considerably reduced the number of apparent artifacts and partially restored the ability to detect real CNVs. Our results suggest WGA material may be appropriate for copy number analysis when amplified samples are compared to similarly amplified samples and that only the CNVs with the greatest significance values detected by such comparisons are likely to be representative of the unamplified samples

    Genome-Wide Association Study of Copy Number Variants Suggests LTBP1 and FGD4 Are Important for Alcohol Drinking

    Alcohol dependence (AD) is a complex disorder characterized by psychiatric and physiological dependence on alcohol. AD is reflected by regular alcohol drinking, which is highly heritable. In this study, to identify susceptibility genes associated with alcohol drinking, we performed a genome-wide association study of copy number variants (CNVs) in 2,286 Caucasian subjects with the Affymetrix SNP6.0 genotyping array. We replicated our findings in 1,627 Chinese subjects with the same genotyping array. We identified two CNVs, CNV207 (combined p-value 1.91E-03) and CNV1836 (combined p-value 3.05E-03), that were associated with alcohol drinking. CNV207 and CNV1836 are located downstream of the genes LTBP1 (870 kb) and FGD4 (400 kb), respectively. LTBP1, by interacting with TGFB1, may down-regulate enzymes directly participating in alcohol metabolism. FGD4 plays a role in clustering and trafficking of the GABAA receptor and may subsequently influence alcohol drinking through activating CDC42. Our results provide suggestive evidence that the newly identified CNV regions and relevant genes may contribute to the genetic mechanism of alcohol dependence.

    Accounting for uncertainty when assessing association between copy number and disease: a latent class model

    BACKGROUND: Copy number variations (CNVs) may play an important role in disease risk by altering the dosage of genes and other regulatory elements, which may have functional and, ultimately, phenotypic consequences. Therefore, determining whether or not a CNV is associated with a given disease might be relevant to understanding the genesis and progression of human diseases. Current technology gives CNV probe signals from which copy number status is inferred. Incorporating the uncertainty of CNV calling in the statistical analysis is therefore a highly important aspect. In this paper, we present a framework for assessing association between CNVs and disease in case-control studies where uncertainty is taken into account. We also indicate how to use the model to analyze continuous traits and adjust for confounding covariates. RESULTS: Through simulation studies, we show that our method outperforms other simple methods based on inferring the underlying CNV and assessing association using regular tests that do not propagate call uncertainty. We apply the method to a real data set in a controlled MLPA experiment, showing good results. The methodology is also extended to illustrate how to analyze aCGH data. CONCLUSION: We demonstrate that our method is robust and achieves maximal theoretical power since it accommodates uncertainty when copy number status is inferred. We have made R functions freely available.