43 research outputs found
Systematically Mapping the Epigenetic Context Dependence of Transcription Factor Binding
At the core of gene regulatory networks are transcription factors (TFs) that recognize specific DNA sequences and target distinct gene sets. Characterizing the DNA binding specificity of all TFs is a prerequisite for understanding global gene regulatory logic, which in recent years has resulted in the development of high-throughput methods that probe TF specificity in vitro and are now routinely used to inform or interpret in vivo studies. Despite the broad success of such methods, several challenges remain, two of which are addressed in this thesis.
Genomic DNA can harbor different epigenetic marks that have the potential to alter TF binding, the most prominent being CpG methylation. Given the vast number of modified CpGs in the human genome and an increasing body of literature suggesting a link between epigenetic changes and genome instability, or the onset of diseases such as cancer, methods that can characterize the sensitivity of TFs to DNA methylation are needed to mechanistically interpret its impact on gene expression. We developed a high-throughput in vitro method (EpiSELEX-seq) that probes TF binding to unmodified and modified DNA sequences in competition, resulting in high-resolution maps of TF binding preferences. We found that methylation sensitivity can vary between TFs of the same structural family and is dependent on the position of the 5mCpG within the TF binding site. The importance of our in vitro profiling of methylation sensitivity is demonstrated by the preference of human p53 tetramers for 5mCpGs within their binding site core. This previously unknown, stabilizing effect is also detectable in p53 ChIP-seq data when comparing methylated and unmethylated sites genome-wide.
A second impediment to predicting TF binding is our limited understanding of i) how cooperative participation of a TF in different complexes can alter its binding preference, and ii) how the detailed shape of DNA aids in creating a substrate for adaptive multi-TF binding. To address these questions in detail, we studied the in vitro binding preferences of three D. melanogaster homeodomain TFs: Homothorax (Hth), Extradenticle (Exd) and one of the eight Hox proteins. In vivo, Hth occurs in two splice forms: with (HthFL) and without (HthHM) the DNA binding domain (DBD). HthHM-Exd itself is a Hox cofactor that has been shown to induce latent sequence specificity upon complex formation with Hox proteins. There are three possible complexes that can be formed, all potentially having specific target genes: HthHM-Exd-Hox, HthFL-Exd-Hox, and HthFL-Exd. We characterized the in vitro binding preferences of each of these by developing new computational approaches to analyze high-throughput SELEX-seq data. We found distinct orientation and spacing preferences for HthFL-Exd-Hox, alternative recognition modes that depend on the affinity class a sequence falls into, and a strong preference for a narrow DNA minor groove near Exd's N-terminal DBD. Strikingly, this shape readout is crucial to stabilize the HthHM-Exd-Hox complex in the absence of an Hth DBD and can thus be used to distinguish HthHM from HthFL isoform binding. Mutating the amino acids responsible for the shape readout by Exd and reinserting the engineered protein into the fly genome allowed us to classify in vivo binding sites based on ChIP-seq signal comparison between “shape-mutant” and wild-type Exd.
In summary, the research presented here has investigated TF binding preferences beyond sequence context by combining novel high-throughput experimental and computational methods. This interdisciplinary approach has enabled us to study binding preferences of TF complexes with respect to the epigenetic landscape of their cognate binding sites. Our novel mechanistic insights into DNA shape readout have provided a new avenue for exploiting guided protein engineering to probe how specific TFs interact with their co-factors in a cellular context, and how flanking genomic sequence helps determine which multi-TF complexes will form and which binding mode a complex adopts.
Systematic prediction of DNA shape changes due to CpG methylation explains epigenetic effects on protein–DNA binding
Background
DNA shape analysis has demonstrated the potential to reveal structure-based mechanisms of protein–DNA binding. However, information about the influence of chemical modification of DNA is limited. Cytosine methylation, the most frequent modification, represents the addition of a methyl group at the major groove edge of the cytosine base. In mammalian genomes, cytosine methylation most frequently occurs at CpG dinucleotides. In addition to changing the chemical signature of C/G base pairs, cytosine methylation can affect DNA structure. Since the original discovery of DNA methylation, major efforts have been made to understand its effect from a sequence perspective. Compared to unmethylated DNA, however, little structural information is available for methylated DNA, due to the limited number of experimentally determined structures. To achieve a better mechanistic understanding of the effect of CpG methylation on local DNA structure, we developed a high-throughput method, methyl-DNAshape, for predicting the effect of cytosine methylation on DNA shape.
Results
Using our new method, we found that CpG methylation significantly altered local DNA shape. Four DNA shape features—helix twist, minor groove width, propeller twist, and roll—were considered in this analysis. Distinct distributions of effect size were observed for different features. Roll and propeller twist were the DNA shape features most strongly affected by CpG methylation with an effect size depending on the local sequence context. Methylation-induced changes in DNA shape were predictive of the measured rate of cleavage by DNase I and suggest a possible mechanism for some of the methylation sensitivities that were recently observed for human Pbx-Hox complexes.
Conclusions
CpG methylation is an important epigenetic mark in the mammalian genome. Understanding its role in protein–DNA recognition can further our knowledge of gene regulation. Our high-throughput methyl-DNAshape method can be used to predict the effect of cytosine methylation on DNA shape and its subsequent influence on protein–DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
Quantitative Analysis of the DNA Methylation Sensitivity of Transcription Factor Complexes
Although DNA modifications play an important role in gene regulation, the underlying mechanisms remain elusive. We developed EpiSELEX-seq to probe the sensitivity of transcription factor binding to DNA modification in vitro using massively parallel sequencing. Feature-based modeling quantifies the effect of cytosine methylation (5mC) on binding free energy in a position-specific manner. Application to the human bZIP proteins ATF4 and C/EBPβ and three different Pbx-Hox complexes shows that 5mCpG can both increase and decrease affinity, depending on where the modification occurs within the protein-DNA interface. The TF paralogs tested vary in their methylation sensitivity, for which we provide a structural rationale. We show that 5mCpG can also enhance in vitro p53 binding and provide evidence for increased in vivo p53 occupancy at methylated binding sites, correlating with primed enhancer histone marks. Our results establish a powerful strategy for dissecting the epigenomic modulation of protein-DNA interactions and their role in gene regulation.
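To make the link between a change in affinity and a change in binding free energy concrete, the following minimal sketch converts a relative affinity (methylated versus unmethylated site) into ΔΔG using the standard relation ΔΔG = -RT ln(K_a,meth / K_a,unmeth). It is not the feature-based model described in the abstract; the function names, the count-based enrichment estimate, and the default temperature are assumptions introduced only for illustration.

```python
import math

# Gas constant in kcal/(mol*K); a standard physical constant.
R_KCAL = 1.987204e-3


def ddg_from_relative_affinity(rel_affinity, temperature_k=298.15):
    """Return ddG in kcal/mol for a methylated site relative to the same
    site without methylation.

    rel_affinity is K_a(methylated) / K_a(unmethylated). A negative result
    means methylation stabilizes binding; a positive result means it
    weakens binding. The default temperature is an assumption.
    """
    if rel_affinity <= 0:
        raise ValueError("relative affinity must be positive")
    return -R_KCAL * temperature_k * math.log(rel_affinity)


def relative_affinity_from_counts(meth_bound, meth_input,
                                  unmeth_bound, unmeth_input):
    """Hypothetical estimator: ratio of the enrichments (bound / input) of
    the methylated and unmethylated versions of one sequence.

    Assumes a single selection round far from saturation, so that
    enrichment is roughly proportional to affinity.
    """
    meth_enrichment = meth_bound / meth_input
    unmeth_enrichment = unmeth_bound / unmeth_input
    return meth_enrichment / unmeth_enrichment


if __name__ == "__main__":
    # Made-up read counts, used only to exercise the functions.
    ratio = relative_affinity_from_counts(400, 100, 200, 100)
    print(f"relative affinity: {ratio:.2f}")
    print(f"ddG: {ddg_from_relative_affinity(ratio):.3f} kcal/mol")
```

Under these assumptions, a twofold preference for the methylated site corresponds to a ΔΔG of roughly -0.4 kcal/mol at room temperature, which illustrates why position-specific methylation effects can be sizable in affinity terms while remaining small in energy terms.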
Accurate and sensitive quantification of protein-DNA binding affinity
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.
A leukemia-protective germline variant mediates chromatin module formation via transcription factor nucleation
Non-coding variants coordinate transcription factor (TF) binding and chromatin mark enrichment changes over regions spanning >100 kb. These molecularly coordinated regions are named "variable chromatin modules" (VCMs), providing a conceptual framework of how regulatory variation might shape complex traits. To better understand the molecular mechanisms underlying VCM formation, here, we mechanistically dissect a VCM-modulating noncoding variant that is associated with reduced chronic lymphocytic leukemia (CLL) predisposition and disease progression. This common, germline variant constitutes a 5-bp indel that controls the activity of an AXIN2 gene-linked VCM by creating a MEF2 binding site, which, upon binding, activates a super-enhancer-like regulatory element. This triggers a large change in TF binding activity and chromatin state at an enhancer cluster spanning >150 kb, coinciding with subtle, long-range chromatin compaction and robust AXIN2 up-regulation. Our results support a model in which the indel acts as an AXIN2 VCM-activating TF nucleation event, which modulates CLL pathology.
Network Propagation Reveals Novel Features Predicting Drug Response of Cancer Cell Lines
Translating data derived from cancer genomes into personalized cancer therapy is a holy grail of computational biology. An important, yet challenging, question in this undertaking is how to relate features of tumor cells to clinical outcomes of anticancer drugs. Recent progress in large pharmacogenomic studies has provided a wealth of data about cancer cell lines, indicating that many genetic and gene expression candidates might predict the drug response of cancer cells. Unfortunately, most of the predicted features are inconsistent with current clinical knowledge and lack mutual dependencies that could explain their molecular mode of action. To address this question, we have developed a new method, named dNetFS, to prioritize genetic and gene expression features of cancer cell lines that predict drug response, by integrating genomic/pharmaceutical data, a protein-protein interaction network, and prior knowledge of drug-target interactions with the techniques of network propagation. Compared with previous methods, dNetFS is more accurate in cross-validation analysis, and it is able to reveal the key pathways involved in drug response. It therefore provides a basis to identify the underlying molecular mechanism for a given compound in different genomic backgrounds.
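For readers unfamiliar with network propagation, the sketch below shows one common form of it, a random walk with restart on an undirected network. It is not the dNetFS algorithm, whose details are not given in the abstract; the function name, the restart probability, and the toy four-node network are all assumptions chosen for illustration.

```python
def propagate(adjacency, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours. It is
               assumed to be symmetric with no isolated nodes.
    seeds:     dict mapping seed nodes to positive prior weights
               (for example, known drug targets).
    Returns a dict of propagated scores that sum to 1.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    total = sum(seeds.values())
    # Normalised restart distribution concentrated on the seed nodes.
    p0 = {n: seeds.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its score.
            inflow = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Toy path network A - B - C - D with A as the only seed.
    toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    for node, score in sorted(propagate(toy, {"A": 1.0}).items()):
        print(node, round(score, 4))
```

Nodes closer to the seed receive higher scores, which is the property that lets propagation-based methods rank features by their network proximity to prior knowledge such as known drug targets.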
Structure of the GTPase heterodimer of chloroplast SRP54 and FtsY from Arabidopsis thaliana
Chromatin modules and their implication in genomic organization and gene regulation
Regulation of gene expression is a complex but highly guided process. While genomic technologies and computational approaches have allowed high-throughput mapping of cis-regulatory elements (CREs) and their interactions in 3D, their precise role in regulating gene expression remains obscure. Recent complementary observations revealed that interactions between CREs frequently result in the formation of small-scale functional modules within topologically associating domains. Such chromatin modules likely emerge from a complex interplay between regulatory machineries assembled at CREs, including site-specific binding of transcription factors. Here, we review the methods that allow identifying chromatin modules, summarize possible mechanisms that steer CRE interactions within these modules, and discuss outstanding challenges to uncover how chromatin modules fit in our current understanding of the functional 3D genome.
Low-Affinity Binding Sites and the Transcription Factor Specificity Paradox in Eukaryotes
Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.
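The argument that locally elevated TF concentration can make low-affinity sites functional can be illustrated with a simple single-site equilibrium binding model, in which occupancy equals [TF] / ([TF] + K_d). This model, the function name, and all numerical values below are assumptions for illustration; they are not taken from the review.

```python
def occupancy(tf_conc_nm, kd_nm):
    """Equilibrium fractional occupancy of a single binding site.

    tf_conc_nm: free TF concentration in nM.
    kd_nm:      dissociation constant of the site in nM.
    """
    if tf_conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and K_d must be > 0")
    return tf_conc_nm / (tf_conc_nm + kd_nm)


if __name__ == "__main__":
    # Hypothetical values: a high-affinity and a low-affinity site, each
    # evaluated at a nucleus-wide average and a locally enriched TF level.
    sites = {"high-affinity": 10.0, "low-affinity": 1000.0}
    concentrations = {"average": 10.0, "local hub": 1000.0}
    for site, kd in sites.items():
        for label, conc in concentrations.items():
            print(f"{site:>13} site, {label:>9}: {occupancy(conc, kd):.3f}")
```

With these made-up numbers, the low-affinity site is about 1% occupied at the average concentration but 50% occupied at the locally elevated one, while the high-affinity site is substantially bound in both cases.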
