437 research outputs found

    Multi-membership gene regulation in pathway based microarray analysis

    This article is available through the Brunel Open Access Publishing Fund. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be used for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent gene-to-pathway allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. 
We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. The work was sponsored by the studentship scheme of the School of Information Systems, Computing and Mathematics, Brunel Universit
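
The abstract above compares the allocations produced by different search algorithms using, among other measures, Hamming distance. As a minimal sketch of that idea only, assuming each allocation is flattened into a binary vector with one entry per (gene, pathway) membership, the distance could be computed as follows. The variable names and example vectors are hypothetical and are not taken from the paper.

```python
def hamming_distance(alloc_a, alloc_b):
    """Count the positions at which two equal-length allocations differ."""
    if len(alloc_a) != len(alloc_b):
        raise ValueError("allocations must have the same length")
    return sum(a != b for a, b in zip(alloc_a, alloc_b))


# Hypothetical allocations: 1 means the gene is assigned as contributing
# to that pathway, 0 means it is not. Made-up values for illustration.
run_hill_climbing = [1, 0, 1, 1, 0, 1]
run_annealing = [1, 0, 0, 1, 0, 1]

distance = hamming_distance(run_hill_climbing, run_annealing)
# Dividing by the vector length gives the fraction of memberships that differ.
normalised = distance / len(run_hill_climbing)
print(distance, normalised)
```

A small normalised distance between runs would indicate consistent allocations; how the paper normalises or aggregates the measure is not stated in the abstract.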

    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. 
Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)

    Validating module network learning algorithms using simulated data

    In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary informatio
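
The abstract above mentions a conditional entropy measure for assigning regulators. The exact measure used by LeMoNe is not given here, so the following is only a sketch of the textbook conditional entropy H(Y | X) for discretised data, where X is a candidate regulator's state and Y is a module's state. The function name and the example labels are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(x, y):
    """Return H(Y | X) in bits for two equal-length sequences of labels."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    joint = Counter(zip(x, y))  # counts of each (x, y) pair
    marginal_x = Counter(x)     # counts of each x value
    h = 0.0
    for (x_value, _), count in joint.items():
        p_xy = count / n
        p_y_given_x = count / marginal_x[x_value]
        h -= p_xy * log2(p_y_given_x)
    return h


# Hypothetical discretised states across four conditions.
module_state = ["off", "off", "on", "on"]
informative_regulator = ["low", "low", "high", "high"]
uninformative_regulator = ["low", "high", "low", "high"]

h_informative = conditional_entropy(informative_regulator, module_state)
h_uninformative = conditional_entropy(uninformative_regulator, module_state)
print(h_informative, h_uninformative)
```

Under this definition, a lower value means the regulator's state leaves less uncertainty about the module's state, which is the intuition behind ranking candidate regulators by such a score.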

    Positional dependence of transcriptional inhibition by DNA torsional stress in yeast chromosomes

    How DNA helical tension is constrained along the linear chromosomes of eukaryotic cells is poorly understood. In this study, we induced the accumulation of DNA (+) helical tension in Saccharomyces cerevisiae cells and examined how DNA transcription was affected along yeast chromosomes. The results revealed that, whereas the overwinding of DNA produced a general impairment of transcription initiation, genes situated at <100 kb from the chromosomal ends gradually escaped from the transcription stall. This novel positional effect seemed to be a simple function of the gene distance to the telomere: It occurred evenly in all 32 chromosome extremities and was independent of the atypical structure and transcription activity of subtelomeric chromatin. These results suggest that DNA helical tension dissipates at chromosomal ends and, therefore, provides a functional indication that yeast chromosome extremities are topologically open. The gradual escape from the transcription stall along the chromosomal flanks also indicates that friction restrictions to DNA twist diffusion, rather than tight topological boundaries, might suffice to confine DNA helical tension along eukaryotic chromatin

    NEAT: An efficient network enrichment analysis test

    Background: Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. Results: We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. Conclusions: NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat )
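
The abstract above states that the NEAT test is based on the hypergeometric distribution. How NEAT maps network links onto the distribution's parameters is defined in the paper and the neat package, not here, so the sketch below shows only a generic upper-tail hypergeometric p-value. The function name and the counts are hypothetical, and the code does not use or reproduce the neat package API.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible outcomes contribute nothing to the sum.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 10 items in total, 4 of them "successes",
# 3 drawn without replacement, at least 2 successes observed.
p_value = hypergeom_upper_tail(k=2, population=10, successes=4, draws=3)
print(p_value)
```

Because the tail probability is computed in closed form, no resampling is needed, which is consistent with the speed advantage over resampling-based tests reported in the abstract.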

    A survey on feature weighting based K-Means algorithms

    This is a pre-copyedited, author-produced PDF of an article accepted for publication in Journal of Classification [de Amorim, R. C., 'A survey on feature weighting based K-Means algorithms', Journal of Classification, Vol. 33(2): 210-242, August 25, 2016]. Subject to embargo. Embargo end date: 25 August 2017. The final publication is available at Springer via http://dx.doi.org/10.1007/s00357-016-9208-4 © Classification Society of North America 2016. In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be taken into account during the clustering process. With over 50 years of history, K-Means is arguably the most popular partitional clustering algorithm there is. The first K-Means based clustering algorithm to compute feature weights was designed just over 30 years ago. Various such algorithms have been designed since but there has not been, to our knowledge, a survey integrating empirical evidence of cluster recovery ability, common flaws, and possible directions for future research. This paper elaborates on the concept of feature weighting and addresses these issues by critically analysing some of the most popular, or innovative, feature weighting mechanisms based on K-Means. Peer reviewed. Final Accepted Versio
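
The abstract above describes K-Means variants that weight features during clustering. As a minimal sketch, assuming one common form in which each squared feature difference is scaled by a feature weight raised to an exponent, the assignment step could look like this. The weights are fixed by hand for illustration; the surveyed algorithms differ in how they learn them, and none of the names below come from the paper.

```python
def weighted_sq_distance(point, centroid, weights, beta=2.0):
    """Squared Euclidean distance with each feature scaled by weight ** beta."""
    return sum(
        (w ** beta) * (p - c) ** 2
        for p, c, w in zip(point, centroid, weights)
    )


def assign(points, centroids, weights, beta=2.0):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(
            range(len(centroids)),
            key=lambda k: weighted_sq_distance(p, centroids[k], weights, beta),
        )
        for p in points
    ]


# Hypothetical two-feature data: which feature carries the weight
# decides which centroid each point is assigned to.
points = [(0.0, 10.0), (5.0, 0.0)]
centroids = [(0.0, 0.0), (5.0, 10.0)]

labels_first_feature = assign(points, centroids, weights=(1.0, 0.0))
labels_second_feature = assign(points, centroids, weights=(0.0, 1.0))
print(labels_first_feature, labels_second_feature)
```

The example shows why feature weights matter: the same points and centroids yield opposite assignments depending on which feature is treated as relevant.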

    A Genome-Wide Analysis of Promoter-Mediated Phenotypic Noise in Escherichia coli

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as “phenotypic noise.” In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alon

    Kinetic CRAC uncovers a role for Nab3 in determining gene expression profiles during stress

    RNA-binding proteins play a key role in shaping gene expression profiles during stress; however, little is known about the dynamic nature of these interactions and how this influences the kinetics of gene expression. To address this, we developed kinetic cross-linking and analysis of cDNAs (χCRAC), an ultraviolet cross-linking method that enabled us to quantitatively measure the dynamics of protein–RNA interactions in vivo on a minute time-scale. Here, using χCRAC we measure the global RNA-binding dynamics of the yeast transcription termination factor Nab3 in response to glucose starvation. These measurements reveal rapid changes in protein–RNA interactions within 1 min following stress imposition. Changes in Nab3 binding are largely independent of alterations in transcription rate during the early stages of stress response, indicating orthogonal transcriptional control mechanisms. We also uncover a function for Nab3 in dampening expression of stress-responsive genes. χCRAC has the potential to greatly enhance our understanding of in vivo dynamics of protein–RNA interactions

    The Yeast La Related Protein Slf1p Is a Key Activator of Translation during the Oxidative Stress Response

    The mechanisms by which RNA-binding proteins control the translation of subsets of mRNAs are not yet clear. Slf1p and Sro9p are atypical-La motif containing proteins which are members of a superfamily of RNA-binding proteins conserved in eukaryotes. RIP-Seq analysis of these two yeast proteins identified overlapping and distinct sets of mRNA targets, including highly translated mRNAs such as those encoding ribosomal proteins. In parallel, transcriptome analysis of slf1Δ and sro9Δ mutant strains indicated altered gene expression in similar functional classes of mRNAs following loss of each factor. The loss of SLF1 had a greater impact on the transcriptome, and in particular, revealed changes in genes involved in the oxidative stress response. slf1Δ cells are more sensitive to oxidants and RIP-Seq analysis of oxidatively stressed cells enriched Slf1p targets encoding antioxidants and other proteins required for oxidant tolerance. To quantify these effects at the protein level, we used label-free mass spectrometry to compare the proteomes of wild-type and slf1Δ strains following oxidative stress. This analysis identified several proteins which are normally induced in response to hydrogen peroxide, but where this increase is attenuated in the slf1Δ mutant. Importantly, a significant number of the mRNAs encoding these targets were also identified as Slf1p-mRNA targets. We show that Slf1p remains associated with the few translating ribosomes following hydrogen peroxide stress and that Slf1p co-immunoprecipitates ribosomes and members of the eIF4E/eIF4G/Pab1p ‘closed loop’ complex, suggesting that Slf1p interacts with actively translated mRNAs following stress. Finally, mutational analysis of SLF1 revealed a novel ribosome interacting domain in Slf1p, independent of its RNA binding La-motif. Together, our results indicate that Slf1p mediates a translational response to oxidative stress via mRNA-specific translational control

    Hsp90 orchestrates transcriptional regulation by Hsf1 and cell wall remodelling by MAPK signalling during thermal adaptation in a pathogenic yeast

    Acknowledgments: We thank Rebecca Shapiro for creating CaLC1819, CaLC1855 and CaLC1875, Gillian Milne for help with EM, Aaron Mitchell for generously providing the transposon insertion mutant library, Jesus Pla for generously providing the hog1 hst7 mutant, and Cathy Collins for technical assistance. Peer reviewed. Publisher PD