90 research outputs found
Haplotype differences for copy number variants in the 22q11.23 region among human populations: a pigmentation-based model for selective pressure.
Two gene clusters are tightly linked in a narrow region of chromosome 22q11.23: the macrophage migration inhibitory factor (MIF) gene family and the glutathione S-transferase theta class. Within 120 kb in this region, two 30-kb deletions reach high frequencies in human populations. This gives rise to four haplotypic arrangements, which modulate the number of genes in both families. The variable patterns of linkage disequilibrium (LD) between these copy number variants (CNVs) in diverse human populations remain poorly understood. We analyzed 2469 individuals belonging to 27 human populations with different ethnic origins. Then we correlated the genetic variability of 22q11.23 CNVs with environmental variables. We confirmed an increasing strength of LD from Africa to Asia and to Europe. Further, we highlighted strongly significant correlations between the frequency of one of the haplotypes and pigmentation-related variables: skin color (R²=0.675, P<0.001), distance from the equator (R²=0.454, P<0.001), UVA radiation (R²=0.439, P<0.001), and UVB radiation (R²=0.313, P=0.002). The fact that all MIF-related genes are retained on this haplotype and the evidence gleaned from experimental systems seem to agree with the role of MIF-related genes in melanogenesis. As such, we propose a model that explains the geographic and ethnic distribution of 22q11.23 CNVs among human populations, assuming that MIF-related gene dosage could be associated with adaptation to low UV radiation.
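As an illustration of the R² values quoted in this abstract, the following is a minimal sketch of how a coefficient of determination can be computed for a simple linear relationship between a haplotype frequency and one environmental variable. The function name r_squared and the toy numbers are hypothetical and are not taken from the study; the authors' actual statistical procedure is not described here.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    For a simple linear regression with one predictor, this equals the
    coefficient of determination (R^2).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data (NOT from the study): absolute latitude of five
# populations and the frequency of one haplotype in each of them.
toy_latitude = [5.0, 15.0, 30.0, 45.0, 60.0]
toy_haplotype_freq = [0.12, 0.20, 0.31, 0.38, 0.55]

toy_r2 = r_squared(toy_latitude, toy_haplotype_freq)
print(f"R^2 for the toy data: {toy_r2:.3f}")
```

An R² close to 1 would mean that most of the between-population variance in haplotype frequency is explained by the environmental variable; the significance test (the P values in the abstract) is a separate step not shown in this sketch.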
Extensive Copy-Number Variation of Young Genes across Stickleback Populations
MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Long non-coding RNAs and cancer: a new frontier of translational research?
Tiling array and novel sequencing technologies have made available the transcription profile of the entire human genome. However, the extent of transcription and the function of genetic elements that occur outside of protein-coding genes, particularly those involved in disease, are still a matter of debate. In this review, we focus on long non-coding RNAs (lncRNAs) that are involved in cancer. We define lncRNAs and present a cancer-oriented list of lncRNAs, list some tools (for example, public databases) that classify lncRNAs or that scan genome spans of interest to find whether known lncRNAs reside there, and describe some of the functions of lncRNAs and the possible genetic mechanisms that underlie lncRNA expression changes in cancer, as well as current and potential future applications of lncRNA research in the treatment of cancer. RS is supported as a fellow of the TALENTS Programme (7th R&D Framework Programme, Specific Programme: PEOPLE—Marie Curie Actions—COFUND). MIA is supported as a PhD fellow of the FCT (Fundação para a Ciência e Tecnologia), Portugal. GAC is supported as a fellow by The University of Texas MD Anderson Cancer Center Research Trust, as a research scholar by The University of Texas System Regents, and by the Chronic Lymphocytic Leukemia Global Research Foundation. Work in GAC’s laboratory is supported in part by the NIH/NCI (CA135444); a Department of Defense Breast Cancer Idea Award; Developmental Research Awards from the Breast Cancer, Ovarian Cancer, Brain Cancer, Multiple Myeloma and Leukemia Specialized Programs of Research Excellence (SPORE) grants from the National Institutes of Health; a 2009 Seena Magowitz–Pancreatic Cancer Action Network AACR Pilot Grant; the Laura and John Arnold Foundation and the RGK Foundation.
Genetic and epigenetic variations contributed by Alu retrotransposition
BACKGROUND: De novo retrotransposition of Alu elements has been recognized as a major driver for insertion polymorphisms in human populations. In this study, we exploited Alu-anchored bisulfite PCR libraries to identify evolutionarily recent Alu element insertions, and to investigate their genetic and epigenetic variation.
RESULTS: A total of 327 putatively recent Alu insertions were identified, altogether represented by 1,762 sequence reads. Nearly all such de novo retrotransposition events (316/327) were novel. Forty-seven out of forty-nine randomly selected events, corresponding to nineteen genomic loci, were sequence-verified. Alu element insertions remained hemizygous in one or more individuals in sixteen of the nineteen genomic loci. The Alu elements were found to be enriched for young Alu families with characteristic sequence features, such as the presence of a longer poly(A) tail. In addition, we documented the occurrence of a duplication of the AT-rich target site in their immediate flanking sequences, a hallmark of retrotransposition. Furthermore, we found the sequence motif (TT/AAAA) that is recognized by the ORF2P protein encoded by LINE-1 in their 5'-flanking regions, consistent with the fact that Alu retrotransposition is facilitated by LINE-1 elements. While most of these Alu elements were heavily methylated, we identified an Alu localized 1.5 kb downstream of TOMM5 that exhibited a completely unmethylated left arm. Interestingly, we observed differential methylation of its immediate 5' and 3' flanking CpG dinucleotides, in concordance with the unmethylated and methylated statuses of its internal 5' and 3' sequences, respectively. Importantly, TOMM5's CpG island and the 3 Alu repeats and 1 MIR element localized upstream of this newly inserted Alu were also found to be unmethylated. Methylation analyses of two additional genomic loci revealed no methylation differences in CpG dinucleotides flanking the Alu insertion sites in the two homologous chromosomes, irrespective of the presence or absence of the insertion.
CONCLUSIONS: We anticipate that the combination of methodologies utilized in this study, which included repeat-anchored bisulfite PCR sequencing and the computational analysis pipeline herein reported, will prove invaluable for the generation of genetic and epigenetic variation maps.
Whole Genome Resequencing Reveals Natural Target Site Preferences of Transposable Elements in Drosophila melanogaster
Transposable elements are mobile DNA sequences that integrate into host genomes using diverse mechanisms with varying degrees of target site specificity. While the target site preferences of some engineered transposable elements are well studied, the natural target preferences of most transposable elements are poorly characterized. Using population genomic resequencing data from 166 strains of Drosophila melanogaster, we identified over 8,000 new insertion sites not present in the reference genome sequence that we used to decode the natural target preferences of 22 families of transposable elements in this species. We found that terminal inverted repeat transposon and long terminal repeat retrotransposon families present clade-specific target site duplications and target site sequence motifs. Additionally, we found that the sequence motifs at transposable element target sites are always palindromes that extend beyond the target site duplication. Our results demonstrate the utility of population genomics data for high-throughput inference of transposable element targeting preferences in the wild and establish general rules for terminal inverted repeat transposon and long terminal repeat retrotransposon target site selection in eukaryotic genomes.
Evolutionary Conservation of the Functional Modularity of Primate and Murine LINE-1 Elements
LINE-1 (L1) retroelements emerged in mammalian genomes over 80 million years ago with a few dominant subfamilies amplifying over discrete time periods that led to distinct human and mouse L1 lineages. We evaluated the functional conservation of L1 sequences by comparing retrotransposition rates of chimeric human-rodent L1 constructs to their parental L1 counterparts. Although amino acid conservation varies from ∼35% to 63% for the L1 ORF1p and ORF2p, most human and mouse L1 sequences can be functionally exchanged. Replacing either ORF1 or ORF2 to create chimeric human-mouse L1 elements did not adversely affect retrotransposition. The mouse ORF2p retains retrotransposition-competency to support both Alu and L1 mobilization when any of the domain sequences we evaluated were substituted with human counterparts. However, the substitution of portions of the mouse cys-domain into the human ORF2p reduces both L1 retrotransposition and Alu trans-mobilization by 200–1000-fold. The observed loss of ORF2p function is independent of the endonuclease or reverse transcriptase activities of ORF2p and RNA interaction required for reverse transcription. In addition, the loss of function is physically separate from the cysteine-rich motif sequence previously shown to be required for RNP formation. Our data suggest an additional role of the less characterized carboxy-terminus of the L1 ORF2 protein by demonstrating that this domain, in addition to mediating RNP interaction(s), provides an independent and required function for the retroelement amplification process. Our experiments show a functional modularity of most of the LINE sequences. However, divergent evolution of interactions within L1 has led to non-reciprocal incompatibilities between human and mouse ORF2 cys-domain sequences.
A Comprehensive Map of Mobile Element Insertion Polymorphisms in Humans
As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes for genetic diseases, including hemophilia, neurofibromatosis, and various cancers. Here we present a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data of 185 samples in three major populations detected with two detection methods. This catalog enables us to systematically study mutation rates, population segregation, genomic distribution, and functional properties of MEI polymorphisms and to compare MEI to SNP variation from the same individuals. Population allele frequencies of MEI and SNPs are described, broadly, by the same neutral ancestral processes despite vastly different mutation mechanisms and rates, except in coding regions where MEI are virtually absent, presumably due to strong negative selection. A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations.
Long interspersed nuclear element-1 hypomethylation in cancer: biology and clinical applications
Epigenetic changes in long interspersed nuclear element-1s (LINE-1s or L1s) occur early during the process of carcinogenesis. A lower methylation level (hypomethylation) of LINE-1 is common in most cancers, and the methylation level is further decreased in more advanced cancers. Consequently, several previous studies have suggested the use of LINE-1 hypomethylation levels in cancer screening, risk assessment, tumor staging, and prognostic prediction. Epigenomic changes are complex, and global hypomethylation influences LINE-1s in a generalized fashion. However, the methylation levels of some loci are dependent on their locations. The consequences of LINE-1 hypomethylation are genomic instability and alteration of gene expression. There are several mechanisms that promote both of these consequences in cis. Therefore, the methylation levels of different sets of LINE-1s may represent certain phenotypes. Furthermore, the methylation levels of specific sets of LINE-1s may indicate carcinogenesis-dependent hypomethylation. LINE-1 methylation pattern analysis can classify LINE-1s into one of three classes based on the number of methylated CpG dinucleotides. These classes include hypermethylation, partial methylation, and hypomethylation. The number of partial and hypermethylated loci, but not hypomethylated LINE-1s, is different among normal cell types. Consequently, the number of hypomethylated loci is a more promising marker than methylation level in the detection of cancer DNA. Further genome-wide studies to measure the methylation level of each LINE-1 locus may improve PCR-based methylation analysis to allow for a more specific and sensitive detection of cancer DNA or for an analysis of certain cancer phenotypes.
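The three-class scheme mentioned in this abstract can be sketched roughly as follows. The abstract does not give the cut-offs, so this sketch assumes the simplest rule: a locus is hypermethylated when every assayed CpG is methylated, hypomethylated when none is, and partially methylated otherwise. The function names, locus labels, and toy calls are hypothetical and do not come from the review.

```python
from collections import Counter


def classify_locus(cpg_states):
    """Classify one LINE-1 locus from its per-CpG methylation calls.

    cpg_states: sequence of booleans, True meaning "this CpG is methylated".
    Assumed rule (not specified in the abstract): all methylated -> hyper,
    none methylated -> hypo, anything in between -> partial.
    """
    if not cpg_states:
        raise ValueError("locus has no CpG calls")
    methylated = sum(1 for state in cpg_states if state)
    if methylated == len(cpg_states):
        return "hypermethylation"
    if methylated == 0:
        return "hypomethylation"
    return "partial methylation"


def count_classes(loci):
    """Count how many loci fall into each class.

    loci: mapping from a locus label to its sequence of CpG calls.
    """
    return Counter(classify_locus(states) for states in loci.values())


# Hypothetical toy sample with two assayed CpGs per locus.
toy_loci = {
    "locus_A": [True, True],
    "locus_B": [True, False],
    "locus_C": [False, False],
    "locus_D": [False, True],
}

counts = count_classes(toy_loci)
print(counts["hypomethylation"], "hypomethylated locus/loci in the toy sample")
```

Counting loci per class, rather than averaging methylation across all CpGs, mirrors the abstract's point that the number of hypomethylated loci may be a more promising marker than the overall methylation level.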
Genomic copy number variation in Mus musculus.
BACKGROUND: Copy number variation is an important dimension of genetic diversity and has implications in development and disease. As an important model organism, the mouse is a prime candidate for copy number variant (CNV) characterization, but this has yet to be completed for a large sample size. Here we report CNV analysis of publicly available, high-density microarray data files for 351 mouse tail samples, including 290 mice that had not been characterized for CNVs previously.
RESULTS: We found 9634 putative autosomal CNVs across the samples affecting 6.87% of the mouse reference genome. We find significant differences in the degree of CNV uniqueness (single sample occurrence) and the nature of CNV-gene overlap between wild-caught mice and classical laboratory strains. CNV-gene overlap was associated with lipid metabolism, pheromone response and olfaction compared to immunity, carbohydrate metabolism and amino-acid metabolism for wild-caught mice and classical laboratory strains, respectively. Using two subspecies of wild-caught Mus musculus, we identified putative CNVs unique to those subspecies and show this diversity is better captured by wild-derived laboratory strains than by the classical laboratory strains. A total of 9 genic copy number variable regions (CNVRs) were selected for experimental confirmation by droplet digital PCR (ddPCR).
CONCLUSION: The analysis we present is a comprehensive, genome-wide analysis of CNVs in Mus musculus, which increases the number of known variants in the species and will accelerate the identification of novel variants in future studies.
Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch
Silver birch (Betula pendula) is a pioneer boreal tree that can be induced to flower within 1 year. Its rapid life cycle, small (440-Mb) genome, and advanced germplasm resources make birch an attractive model for forest biotechnology. We assembled and chromosomally anchored the nuclear genome of an inbred B. pendula individual. Gene duplicates from the paleohexaploid event were enriched for transcriptional regulation, whereas tandem duplicates were overrepresented by environmental responses. Population resequencing of 80 individuals showed effective population size crashes at major points of climatic upheaval. Selective sweeps were enriched among polyploid duplicates encoding key developmental and physiological triggering functions, suggesting that local adaptation has tuned the timing of and cross-talk between fundamental plant processes. Variation around the tightly linked light response genes PHYC and FRS10 correlated with latitude, longitude, and temperature, and with precipitation for PHYC. Similar associations characterized the growth-promoting cytokinin response regulator ARR1, and the wood development genes KAK and MED5A. Peer reviewed.