176 research outputs found

    CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets.

    Promoter capture Hi-C (PCHi-C) allows the genome-wide interrogation of physical interactions between distal DNA regulatory elements and gene promoters in multiple tissue contexts. Visual integration of the resultant chromosome interaction maps with other sources of genomic annotations can provide insight into underlying regulatory mechanisms. We have developed Capture HiC Plotter (CHiCP), a web-based tool that allows interactive exploration of PCHi-C interaction maps and integration with both public and user-defined genomic datasets. Availability and implementation: CHiCP is freely accessible from www.chicp.org and supports most major HTML5 compliant web browsers. Full source code and installation instructions are available from http://github.com/D-I-L/django-chicp. Contact: [email protected]. This is the published version. It first appeared at http://bioinformatics.oxfordjournals.org/content/early/2016/04/26/bioinformatics.btw173

    A systematic, large-scale comparison of transcription factor binding site models

    Background: The modelling of gene regulation is a major challenge in biomedical research. This process is dominated by transcription factors (TFs), and mutations in their binding sites (TFBSs) may cause the misregulation of genes, eventually leading to disease. The consequences of DNA variants on TF binding are modelled in silico using binding matrices, but it remains unclear whether these are capable of accurately representing in vivo binding. In this study, we present a systematic comparison of binding models for 82 human TFs from three freely available sources: JASPAR matrices, HT-SELEX-generated models and matrices derived from protein binding microarrays (PBMs). We determined their ability to detect experimentally verified “real” in vivo TFBSs derived from ENCODE ChIP-seq data. As negative controls we chose random downstream exonic sequences, which are unlikely to harbour TFBSs. All models were assessed by receiver operating characteristics (ROC) analysis. Results: While the area under the curve was low for most of the tested models, with only 47% reaching a score of 0.7 or higher, we noticed strong differences between the various position-specific scoring matrices, with JASPAR and HT-SELEX models showing higher success rates than PBM-derived models. In addition, we found that while TFBS sequences showed a higher degree of conservation than randomly chosen sequences, there was a high variability between individual TFBSs. Conclusions: Our results show that only a few of the matrix-based models used to predict potential TFBSs are able to reliably detect experimentally confirmed TFBSs. We compiled our findings in a freely accessible web application called ePOSSUM (http://mutationtaster.charite.de/ePOSSUM/), which uses a Bayes classifier to assess the impact of genetic alterations on TF binding in user-defined sequences. Additionally, ePOSSUM provides information on the reliability of the prediction using our test set of experimentally confirmed binding sites.
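The evaluation idea in this abstract (score sequences with a position-specific scoring matrix, then measure how well the scores separate bound sites from negative controls with ROC analysis) can be illustrated with a minimal sketch. This is not the authors' pipeline: the count matrix, the pseudocount, the uniform background, the toy sequences and all function names below are hypothetical and chosen only for illustration.

```python
import math

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column.get(base, 0) for base in "ACGT") + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def best_score(pwm, sequence):
    """Return the highest-scoring window of the matrix width within the sequence."""
    width = len(pwm)
    return max(
        sum(pwm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )

def auc(pos_scores, neg_scores):
    """Area under the ROC curve: probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical 4-bp count matrix favouring the motif TGAC.
toy_counts = [
    {"T": 8, "A": 1, "C": 1},
    {"G": 8, "A": 1, "T": 1},
    {"A": 8, "C": 1, "G": 1},
    {"C": 8, "G": 1, "T": 1},
]
toy_pwm = pwm_from_counts(toy_counts)

# Toy stand-ins for ChIP-seq-derived sites (positives) and control sequences (negatives).
positives = ["AATGACGG", "CTGACTTA"]
negatives = ["AAAAAAAA", "CCGGCCGG"]

pos_scores = [best_score(toy_pwm, s) for s in positives]
neg_scores = [best_score(toy_pwm, s) for s in negatives]
toy_auc = auc(pos_scores, neg_scores)
```

On real data the positives and negatives would overlap in score, and the resulting AUC would then be compared against a threshold such as the 0.7 mentioned in the abstract.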

    Cellular dissection of psoriasis for transcriptome analyses and the post-GWAS era

    Background: Genome-scale studies of psoriasis have been used to identify genes of potential relevance to disease mechanisms. For many identified genes, however, the cell type mediating disease activity is uncertain, which has limited our ability to design gene functional studies based on genomic findings. Methods: We identified differentially expressed genes (DEGs) with altered expression in psoriasis lesions (n = 216 patients), as well as candidate genes near susceptibility loci from psoriasis GWAS studies. These gene sets were characterized based upon their expression across 10 cell types present in psoriasis lesions. Susceptibility-associated variation at intergenic (non-coding) loci was evaluated to identify sites of allele-specific transcription factor binding. Results: Half of DEGs showed highest expression in skin cells, although the dominant cell type differed between psoriasis-increased DEGs (keratinocytes, 35%) and psoriasis-decreased DEGs (fibroblasts, 33%). In contrast, psoriasis GWAS candidates tended to have highest expression in immune cells (71%), with a significant fraction showing maximal expression in neutrophils (24%, P < 0.001). By identifying candidate cell types for genes near susceptibility loci, we could identify and prioritize SNPs at which susceptibility variants are predicted to influence transcription factor binding. This led to the identification of potentially causal (non-coding) SNPs for which susceptibility variants influence binding of AP-1, NF-κB, IRF1, STAT3 and STAT4. Conclusions: These findings underscore the role of innate immunity in psoriasis and highlight neutrophils as a cell type linked with pathogenetic mechanisms. Assignment of candidate cell types to genes emerging from GWAS studies provides a first step towards functional analysis, and we have proposed an approach for generating hypotheses to explain GWAS hits at intergenic loci. http://deepblue.lib.umich.edu/bitstream/2027.42/109537/1/12920_2013_Article_485.pd

    Stochastic EM-based TFBS motif discovery with MITSU

    Motivation: The Expectation–Maximization (EM) algorithm has been successfully applied to the problem of transcription factor binding site (TFBS) motif discovery and underlies the most widely used motif discovery algorithms. In the wider field of probabilistic modelling, the stochastic EM (sEM) algorithm has been used to overcome some of the limitations of the EM algorithm; however, the application of sEM to motif discovery has not been fully explored. Results: We present MITSU (Motif discovery by ITerative Sampling and Updating), a novel algorithm for motif discovery, which combines sEM with an improved approximation to the likelihood function, which is unconstrained with regard to the distribution of motif occurrences within the input dataset. The algorithm is evaluated quantitatively on realistic synthetic data and several collections of characterized prokaryotic TFBS motifs and shown to outperform EM and an alternative sEM-based algorithm, particularly in terms of site-level positive predictive value. Availability and implementation: Java executable available for download at http://www.sourceforge.net/p/mitsu-motif/, supported on Linux/OS X. Contact: [email protected]

    Lineage-specific dynamic and pre-established enhancer–promoter contacts cooperate in terminal differentiation

    Chromosome conformation is an important feature of metazoan gene regulation; however, enhancer–promoter contact remodeling during cellular differentiation remains poorly understood. To address this, genome-wide promoter capture Hi-C (CHi-C) was performed during epidermal differentiation. Two classes of enhancer–promoter contacts associated with differentiation-induced genes were identified. The first class ('gained') increased in contact strength during differentiation in concert with enhancer acquisition of the H3K27ac activation mark. The second class ('stable') were pre-established in undifferentiated cells, with enhancers constitutively marked by H3K27ac. The stable class was associated with the canonical conformation regulator cohesin, whereas the gained class was not, implying distinct mechanisms of contact formation and regulation. Analysis of stable enhancers identified a new, essential role for a constitutively expressed, lineage-restricted ETS-family transcription factor, EHF, in epidermal differentiation. Furthermore, neither class of contacts was observed in pluripotent cells, suggesting that lineage-specific chromatin structure is established in tissue progenitor cells and is further remodeled in terminal differentiation

    CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data

    Capture Hi-C (CHi-C) is a method for profiling chromosomal interactions involving targeted regions of interest, such as gene promoters, globally and at high resolution. Signal detection in CHi-C data involves a number of statistical challenges that are not observed when using other Hi-C-like techniques. We present a background model and algorithms for normalisation and multiple testing that are specifically adapted to CHi-C experiments. We implement these procedures in CHiCAGO (http://regulatorygenomicsgroup.org/chicago), an open-source package for robust interaction detection in CHi-C. We validate CHiCAGO by showing that promoter-interacting regions detected with this method are enriched for regulatory features and disease-associated SNPs.

    Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters

    Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases

    Gene Properties and Chromatin State Influence the Accumulation of Transposable Elements in Genes

    Transposable elements (TEs) are mobile DNA sequences found in the genomes of almost all species. By measuring the normalized coverage of TE sequences within genes, we identified sets of genes with conserved extremes of high/low TE density in the genomes of human, mouse and cow and denoted them as ‘shared upper/lower outliers (SUOs/SLOs)’. By comparing these outlier genes to the genomic background, we show that a large proportion of SUOs are involved in metabolic pathways and tend to be mammal-specific, whereas many SLOs are related to developmental processes and have more ancient origins. Furthermore, the proportions of different types of TEs within human and mouse orthologous SUOs showed high similarity, even though most detectable TEs in these two genomes inserted after their divergence. Interestingly, our computational analysis of polymerase-II (Pol-II) occupancy at gene promoters in different mouse tissues showed that 60% of tissue-specific SUOs show strong Pol-II binding only in embryonic stem cells (ESCs), a proportion significantly higher than the genomic background (37%). In addition, our analysis of histone marks such as H3K4me3 and H3K27me3 in mouse ESCs also suggests a strong association between TE-rich genes and open chromatin at promoters. Finally, two independent whole-transcriptome datasets show a positive association between TE density and gene expression level in ESCs. While this study focuses on genes with extreme TE densities, the above results clearly show that the probability of TE accumulation/fixation in mammalian genes is not random and is likely associated with different factors/gene properties; most importantly, they indicate an association between the TE insertion/fixation rate and gene activity status in ES cells.

    Global Mapping of DNA Methylation in Mouse Promoters Reveals Epigenetic Reprogramming of Pluripotency Genes

    DNA methylation patterns are reprogrammed in primordial germ cells and in preimplantation embryos by demethylation and subsequent de novo methylation. It has been suggested that epigenetic reprogramming may be necessary for the embryonic genome to return to a pluripotent state. We have carried out a genome-wide promoter analysis of DNA methylation in mouse embryonic stem (ES) cells, embryonic germ (EG) cells, sperm, trophoblast stem (TS) cells, and primary embryonic fibroblasts (pMEFs). Global clustering analysis shows that methylation patterns of ES cells, EG cells, and sperm are surprisingly similar, suggesting that while the sperm is a highly specialized cell type, its promoter epigenome is already largely reprogrammed and resembles a pluripotent state. Comparisons between pluripotent tissues and pMEFs reveal that a number of pluripotency related genes, including Nanog, Lefty1 and Tdgf1, as well as the nucleosome remodeller Smarcd1, are hypomethylated in stem cells and hypermethylated in differentiated cells. Differences in promoter methylation are associated with significant differences in transcription levels in more than 60% of genes analysed. Our comparative approach to promoter methylation thus identifies gene candidates for the regulation of pluripotency and epigenetic reprogramming. While the sperm genome is, overall, similarly methylated to that of ES and EG cells, there are some key exceptions, including Nanog and Lefty1, that are highly methylated in sperm. Nanog promoter methylation is erased by active and passive demethylation after fertilisation before expression commences in the morula. In ES cells the normally active Nanog promoter is silenced when targeted by de novo methylation. Our study suggests that reprogramming of promoter methylation is one of the key determinants of the epigenetic regulation of pluripotency genes. Epigenetic reprogramming in the germline prior to fertilisation and the reprogramming of key pluripotency genes in the early embryo is thus crucial for transmission of pluripotency.