Applying weighted network measures to microarray distance matrices
In recent work we presented a new approach to the analysis of weighted
networks, by providing a straightforward generalization of any network measure
defined on unweighted networks. This approach is based on the translation of a
weighted network into an ensemble of edges, and is particularly suited to the
analysis of fully connected weighted networks. Here we apply our method to
several such networks including distance matrices, and show that the clustering
coefficient, constructed by using the ensemble approach, provides meaningful
insights into the systems studied. In the particular case of two data sets from
microarray experiments the clustering coefficient identifies a number of
biologically significant genes, outperforming existing identification
approaches. Comment: Accepted for publication in J. Phys.
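The ensemble idea in the abstract above can be illustrated with a minimal sketch. It assumes the weights have already been mapped to edge probabilities `p[i][j]` in [0, 1], and it takes the ensemble clustering coefficient of node `i` to be the probability-weighted share of closed triangles among the neighbour pairs of `i`. The function name and this exact form are illustrative assumptions, not taken from the abstract.

```python
from itertools import combinations


def ensemble_clustering(p, i):
    """Clustering coefficient of node i over an edge ensemble.

    p is a symmetric matrix of edge probabilities (assumed to be
    derived from the network weights beforehand).
    """
    others = [j for j in range(len(p)) if j != i]
    closed = 0.0  # expected number of triangles through i
    pairs = 0.0   # expected number of neighbour pairs of i
    for j, k in combinations(others, 2):
        pair = p[i][j] * p[i][k]
        closed += pair * p[j][k]
        pairs += pair
    return closed / pairs if pairs > 0 else 0.0


# For 0/1 probabilities this reduces to the usual unweighted coefficient.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(ensemble_clustering(triangle, 0))
```

With binary probabilities the function reproduces the unweighted definition, which is the sense in which the ensemble view generalizes an unweighted measure.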
Dynamics of gene expression and the regulatory inference problem
From the response to external stimuli to cell division and death, the
dynamics of living cells is based on the expression of specific genes at
specific times. The decision when to express a gene is implemented by the
binding and unbinding of transcription factor molecules to regulatory DNA.
Here, we construct stochastic models of gene expression dynamics and test them
on experimental time-series data of messenger-RNA concentrations. The models
are used to infer biophysical parameters of gene transcription, including the
statistics of transcription factor-DNA binding and the target genes controlled
by a given transcription factor. Comment: revised version to appear in Europhys. Lett., new titl
Regulatory networks and connected components of the neutral space
The functioning of a living cell is largely determined by the structure of
its regulatory network, comprising non-linear interactions between regulatory
genes. An important factor for the stability and evolvability of such
regulatory systems is neutrality - typically a large number of alternative
network structures give rise to the necessary dynamics. Here we study the
discretized regulatory dynamics of the yeast cell cycle [Li et al., PNAS, 2004]
and the set of networks capable of reproducing it, which we call functional.
Among these, the empirical yeast wildtype network is close to optimal with
respect to sparse wiring. Under point mutations, which establish or delete
single interactions, the neutral space of functional networks is fragmented
into 4.7 * 10^8 components. One of the smaller ones contains the wildtype
network. On average, functional networks reachable from the wildtype by
mutations are sparser, have higher noise resilience and fewer fixed point
attractors as compared with networks outside of this wildtype component. Comment: 6 pages, 5 figure
Parametric study of modelling structural timber in fire with different software packages
In a bid to accurately model structural behaviour of timber buildings in fire, a number of obstacles have been identified which must be fully understood before advanced computer modelling can accurately be used to represent physical behaviour. This paper discusses the obstacles, with suggestions on how to mitigate them, incorporating the challenges of using general purpose finite element software. The paper examines modelling with ANSYS, SAFIR and ABAQUS and the individual and collective challenges related to thermal analyses of timber structures in fire conditions. It considers the effects various model parameters (thermal and structural) may have on physical interpretation of experimental data in comparison with the accuracy of numerical solutions. In detail, the study looks at the effects of 1D and 2D heat transfer analyses, finite element mesh sizes, time steps and different thermal property approaches on thermal models of timber members in fires. It further recommends how best to model these structures using the different finite element software packages
The Iterative Signature Algorithm for the analysis of large scale gene expression data
We present a new approach for the analysis of genome-wide expression data.
Our method is designed to overcome the limitations of traditional techniques,
when applied to large-scale data. Rather than allotting each gene to a single
cluster, we assign both genes and conditions to context-dependent and
potentially overlapping transcription modules. We provide a rigorous definition
of a transcription module as the object to be retrieved from the expression
data. We establish an efficient algorithm that searches for the modules encoded
in the data by iteratively refining sets of genes and conditions until they
match this definition. Each iteration involves a linear map, induced by
the normalized expression matrix, followed by the application of a threshold
function. We argue that our method is in fact a generalization of Singular
Value Decomposition, which corresponds to the special case where no threshold
is applied. We show analytically that for noisy expression data our approach
leads to better classification due to the implementation of the threshold. This
result is confirmed by numerical analyses based on in-silico expression data.
We discuss briefly results obtained by applying our algorithm to expression
data from the yeast S. cerevisiae. Comment: Latex, 36 pages, 8 figure
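The iteration described in the abstract above (a linear map followed by a threshold, repeated until the gene and condition sets stop changing) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses one expression matrix `E` (genes by conditions) for both directions, keeps scores more than `t` standard deviations above the mean, and rescales the kept scores. The names `threshold`, `isa`, `t_gene` and `t_cond` are hypothetical.

```python
from statistics import mean, pstdev


def threshold(scores, t):
    """Keep entries more than t standard deviations above the mean."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return {}
    kept = {i: s for i, s in enumerate(scores) if (s - mu) / sigma > t}
    # Rescale so that scores do not grow from one iteration to the next.
    scale = max((abs(s) for s in kept.values()), default=0.0)
    return {i: s / scale for i, s in kept.items()} if scale > 0 else kept


def isa(E, seed_genes, t_gene=0.5, t_cond=0.5, max_iter=50):
    """Refine a seed gene set into a (genes, conditions) module."""
    n_genes, n_conds = len(E), len(E[0])
    genes = {g: 1.0 for g in seed_genes}
    conds = {}
    for _ in range(max_iter):
        # Linear map genes -> conditions, then threshold.
        cond_proj = [sum(E[g][c] * w for g, w in genes.items())
                     for c in range(n_conds)]
        conds = threshold(cond_proj, t_cond)
        # Linear map conditions -> genes, then threshold.
        gene_proj = [sum(E[g][c] * w for c, w in conds.items())
                     for g in range(n_genes)]
        new_genes = threshold(gene_proj, t_gene)
        converged = set(new_genes) == set(genes)
        genes = new_genes
        if converged:
            break
    return sorted(genes), sorted(conds)


# Toy matrix: genes 0 and 1 are co-expressed under conditions 0 and 1.
E = [[2, 2, 0, 0],
     [2, 2, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
print(isa(E, seed_genes=[0]))
```

Starting from a single seed gene, the toy run recovers both co-expressed genes and the two conditions under which they are expressed.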
SMART: Unique splitting-while-merging framework for gene clustering
Copyright © 2014 Fa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet it is difficult to set these parameters a priori. To address this issue, we propose a unique splitting-while-merging clustering framework, named “splitting merging awareness tactics” (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset into a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results, by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested in demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Across many performance metrics, all numerical results show that SMART is superior to the existing self-splitting algorithms and traditional algorithms it was compared with.
Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms.
National Institute for Health Researc
Paradigm of tunable clustering using binarization of consensus partition matrices (Bi-CoPaM) for gene discovery
Copyright © 2013 Abu-Jamous et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and not being assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet the requirements of different gene discovery studies.
National Institute for Health Researc
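The two stages named in the abstract above, aggregation into a fuzzy consensus partition matrix and tunable binarization, can be sketched in a few lines. The sketch assumes the clusters of the individual runs have already been matched to one another, and it uses a simple value threshold as a stand-in for the binarization techniques, which the abstract does not detail. Both function names are hypothetical.

```python
def consensus_partition_matrix(partitions):
    """Average binary partition matrices ([cluster][gene]) over runs.

    Assumes cluster k denotes the same cluster in every run.
    """
    n_runs = len(partitions)
    n_clusters, n_genes = len(partitions[0]), len(partitions[0][0])
    return [[sum(p[k][g] for p in partitions) / n_runs
             for g in range(n_genes)]
            for k in range(n_clusters)]


def binarize(copam, threshold):
    """Assign a gene to every cluster whose membership meets the threshold."""
    return [[1 if u >= threshold else 0 for u in row] for row in copam]


# Four clustering runs, two clusters, three genes.
runs = [
    [[1, 1, 0], [0, 0, 1]],
    [[1, 1, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 0, 1], [0, 1, 0]],
]
copam = consensus_partition_matrix(runs)
print(copam)                 # fuzzy memberships per cluster
print(binarize(copam, 0.5))  # wide clusters: gene 1 falls in both
print(binarize(copam, 0.8))  # tight clusters: genes 1 and 2 unassigned
```

Lowering the threshold widens the clusters and allows overlap, while raising it tightens them and leaves ambiguous genes unassigned, which mirrors the three eventualities the abstract describes.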
Beyond element-wise interactions: identifying complex interactions in biological processes
Background: Biological processes typically involve the interactions of a number of elements (genes, cells) acting on each other. Such processes are often modelled as networks whose nodes are the elements in question and whose edges are pairwise relations between them (transcription, inhibition). But more often than not, elements actually work cooperatively or competitively to achieve a task, or an element acts on the interaction between two others, as in the case of an enzyme controlling a reaction rate. We call these types of interaction “complex” and propose ways to identify them from time-series observations.
Methodology: We use Granger Causality, a measure of the interaction between two signals, to characterize the influence of an enzyme on a reaction rate. We extend its traditional formulation to the case of multi-dimensional signals in order to capture group interactions, and not only element interactions. Our method is extensively tested on simulated data and applied to three biological datasets: microarray data of the Saccharomyces cerevisiae yeast, local field potential recordings of two brain areas and a metabolic reaction.
Conclusions: Our results demonstrate that complex Granger causality can reveal new types of relation between signals and is particularly suited to biological data. Our approach raises some fundamental issues of the systems biology approach since finding all complex causalities (interactions) is an NP-hard problem
Cyclebase.org: version 2.0, an updated comprehensive, multi-species repository of cell cycle experiments and derived analysis results
Cell division involves a complex series of events orchestrated by thousands of molecules. To study this process, researchers have employed mRNA expression profiling of synchronously growing cell cultures progressing through the cell cycle. These experiments, which have been carried out in several organisms, are not easy to access, combine and evaluate. Complicating factors include variation in interdivision time between experiments and differences in relative duration of each cell-cycle phase across organisms. To address these problems, we created Cyclebase, an online resource of cell-cycle-related experiments. This database provides an easy-to-use web interface that facilitates visualization and download of genome-wide cell-cycle data and analysis results. Data from different experiments are normalized to a common timescale and are complemented with key cell-cycle information and derived analysis results. In Cyclebase version 2.0, we have updated the entire database to reflect changes to genome annotations, included information on cyclin-dependent kinase (CDK) substrates, predicted degradation signals and loss-of-function phenotypes from genome-wide screens. The web interface has been improved and provides a single, gene-centric graph summarizing the available cell-cycle experiments. Finally, key information and links to orthologous and paralogous genes are now included to further facilitate comparison of cell-cycle regulation across species. Cyclebase version 2.0 is available at http://www.cyclebase.org
Genotype List String: a grammar for describing HLA and KIR genotyping results in a text string
Knowledge of an individual's human leukocyte antigen (HLA) genotype is essential for modern medical genetics, and is crucial for hematopoietic stem cell and solid-organ transplantation. However, the high levels of polymorphism known for the HLA genes make it difficult to generate an HLA genotype that unambiguously identifies the alleles that are present at a given HLA locus in an individual. For the last 20 years, the histocompatibility and immunogenetics community has recorded this HLA genotyping ambiguity using allele codes developed by the National Marrow Donor Program (NMDP). While these allele codes may have been effective for recording an HLA genotyping result when initially developed, their use today results in increased ambiguity in an HLA genotype, and they are no longer suitable in the era of rapid allele discovery and ultra-high allele polymorphism. Here, we present a text string format capable of fully representing HLA genotyping results. This Genotype List (GL) String format is an extension of a proposed standard for reporting killer-cell immunoglobulin-like receptor (KIR) genotype data that can be applied to any genetic data that use a standard nomenclature for identifying variants. The GL String format uses a hierarchical set of operators to describe the relationships between alleles, lists of possible alleles, phased alleles, genotypes, lists of possible genotypes, and multilocus unphased genotypes, without losing typing information or increasing typing ambiguity. When used in concert with appropriate tools to create, exchange, and parse these strings, we anticipate that GL Strings will replace NMDP allele codes for reporting HLA genotypes
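The hierarchical operators mentioned in the abstract above lend themselves to a small recursive parser. The abstract does not name the operators, so the sketch below assumes the delimiter set `^`, `|`, `+`, `~`, `/` (locus, genotype list, genotype, phased haplotype, allele ambiguity), applied in that order of precedence. Treat the delimiters and the `parse_gl` helper as assumptions to check against the GL String specification.

```python
# Assumed operator precedence, from loosest to tightest binding.
DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl(text, level=0):
    """Split a GL String into nested lists, one nesting level per operator."""
    if level == len(DELIMITERS):
        return text  # a single allele name
    return [parse_gl(part, level + 1)
            for part in text.split(DELIMITERS[level])]


# One locus, one genotype: an ambiguous first allele plus a second allele.
tree = parse_gl("HLA-A*01:01/HLA-A*01:02+HLA-A*24:02")
genotype = tree[0][0]  # first locus, first possible genotype
print(genotype[0][0])  # allele list for the first gene copy
print(genotype[1][0])  # allele list for the second gene copy
```

Because every level is always present in the nested result, a consumer can index by position without checking which operators actually occurred in the string.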
