957 research outputs found
Temperature-dependent benefits of bacterial exposure in embryonic development of Daphnia magna resting eggs
The environments in which animals develop and evolve are profoundly shaped by bacteria, which affect animals both indirectly through their role in biogeochemical processes and directly through antagonistic or beneficial interactions. The outcomes of these activities can differ according to environmental context. In a series of laboratory experiments with diapausing eggs of the water flea Daphnia magna, we manipulated two environmental parameters, temperature and presence of bacteria, and examined their effect on development. At elevated temperatures (≥ 26 °C), resting eggs developing without live bacteria had reduced hatching success and correspondingly higher rates of severe morphological abnormalities compared with eggs with bacteria in their environment. The beneficial effect of bacteria was strongly reduced at 20 °C. Neither temperature nor the presence of bacteria affected directly developing parthenogenetic eggs. The mechanistic basis of this effect of bacteria on development is unclear, but these results highlight the complex interplay of biotic and abiotic factors influencing animal development after diapause
Approximating Weighted Duo-Preservation in Comparative Genomics
Motivated by comparative genomics, Chen et al. [9] introduced the Maximum
Duo-preservation String Mapping (MDSM) problem, in which we are given two
strings over the same alphabet and the goal is to find a mapping between them
that maximizes the number of preserved duos. A duo is any two consecutive
characters in a string, and it is preserved by the mapping if its two
consecutive characters in the first string are mapped to the same two
consecutive characters in the second string. The MDSM problem is known to be
NP-hard, and there are approximation algorithms for it [3, 5, 13], but all of
them consider only the "unweighted" version of the problem, in the sense that a
duo from the first string is preserved by mapping it to any identical duo in
the second string, regardless of their positions in the respective strings.
However, in comparative genomics it is desirable to find mappings that prefer
preserving duos that are "closer" to each other under some distance measure
[19]. In this paper, we introduce a generalized version of the problem, called
the Maximum-Weight Duo-preservation String Mapping (MWDSM) problem, which
captures both duo preservation and duo distance: mapping a duo from the first
string to a preserved duo in the second string has a weight indicating the
"closeness" of the two duos. The objective of the MWDSM problem is to find a
mapping that maximizes the total weight of preserved duos. We give a
polynomial-time 6-approximation algorithm for this problem.
Comment: Appeared in proceedings of the 23rd International Computing and
Combinatorics Conference (COCOON 2017)
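The objective described in this abstract can be illustrated with a minimal Python sketch. It only evaluates a given mapping; it is not the 6-approximation algorithm, which the abstract does not describe. The function names, the representation of the mapping as a list of positions, and the `closeness` weight are all assumptions made for illustration.

```python
def preserved_duos(a, b, mapping):
    """Return each position i in `a` whose duo (a[i], a[i+1]) is preserved.

    `mapping[i]` is the position in `b` that position i of `a` maps to
    (a hypothetical encoding; the abstract does not fix one).
    """
    if len(a) != len(b) or sorted(mapping) != list(range(len(b))):
        raise ValueError("mapping must be a bijection between positions")
    if any(a[i] != b[j] for i, j in enumerate(mapping)):
        raise ValueError("mapping must send each character to an equal character")
    # A duo is preserved when its two characters land on consecutive positions.
    return [i for i in range(len(a) - 1) if mapping[i + 1] == mapping[i] + 1]


def closeness(i, j):
    """Hypothetical weight: larger when the two duos start at nearby positions."""
    return 1.0 / (1 + abs(i - j))


def total_weight(a, b, mapping, weight=closeness):
    """Sum the weights of all preserved duos (the weighted objective)."""
    return sum(weight(i, mapping[i]) for i in preserved_duos(a, b, mapping))


# "abc" at positions 0-2 of the first string maps to positions 2-4 of the
# second; "ab" at positions 3-4 maps to positions 0-1.
example_mapping = [2, 3, 4, 0, 1]
print(preserved_duos("abcab", "ababc", example_mapping))
print(total_weight("abcab", "ababc", example_mapping))
```

Passing `weight=lambda i, j: 1` recovers the unweighted count of preserved duos.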
50 years of the International Committee on Taxonomy of Viruses: progress and prospects
We mark the 50th anniversary of the International Committee on Taxonomy of Viruses (ICTV) by presenting a brief history of the organization since its foundation, showing how it has adapted to advancements in our knowledge of virus diversity and the methods used to characterize it. We also outline recent developments, supported by a grant from the Wellcome Trust (UK), that are facilitating substantial changes in the operations of the ICTV and promoting dialogue with the virology community. These developments will generate improved online resources, including a freely available and regularly updated ICTV Virus Taxonomy Report. They also include a series of meetings between the ICTV and the broader community focused on some of the major challenges facing virus taxonomy, with the outcomes helping to inform the future policy and practice of the ICTV
Florigen and its homologs of FT/CETS/PEBP/RKIP/YbhB family may be the enzymes of small molecule metabolism: review of the evidence.
BACKGROUND: Flowering signals are sensed in plant leaves and transmitted to the shoot apical meristems, where the formation of flowers is initiated. Searches for a diffusible hormone-like signaling entity ("florigen") went on for many decades, until a product of the plant gene FT was identified as the key component of florigen in the 1990s, based on the analysis of mutants, genetic complementation evidence, and protein and RNA localization studies. Sequence homologs of the FT protein are found throughout prokaryotes and eukaryotes; some eukaryotic family members appear to bind phospholipids or interact with components of signal transduction cascades. Most FT homologs are known to share a constellation of five charged residues, three of which, i.e., two histidines and an aspartic acid, are located at the rim of a well-defined cavity on the protein surface. RESULTS: We studied molecular features of the FT homologs in prokaryotes and analyzed their genome context, to find tentative evidence connecting the bacterial FT homologs with small molecule metabolism, often involving substrates that contain sugar or ribonucleoside moieties. We argue that the unifying feature of this protein family, i.e., a set of charged residues conserved at the sequence and structural levels, is more likely to be an enzymatic active center than a catalytically inert ligand-binding site. CONCLUSIONS: We propose that most FT-related proteins are enzymes operating on small diffusible molecules. Those metabolites may constitute an overlooked essential ingredient of the florigen signal
Measuring gene expression divergence: the distance to keep
BACKGROUND: Gene expression divergence is a phenotypic trait reflecting evolution of gene regulation and characterizing dissimilarity between species and between cells and tissues within the same species. Several distance measures, such as Euclidean and correlation-based distances, have been proposed for measuring expression divergence. RESULTS: We show that different distance measures identify different trends in gene expression patterns. When comparing orthologous genes in eight rat and human tissues, the Euclidean distance identified genes uniformly expressed in all tissues near the expression background as genes with the most conserved expression pattern. In contrast, correlation-based distance and generalized-average distance identified genes with concerted changes among homologous tissues as those most conserved. On the other hand, correlation-based distance, Euclidean distance and generalized-average distance highlight quite well the relatively high similarity of gene expression patterns in homologous tissues between species, compared to non-homologous tissues within species. CONCLUSIONS: Different trends exist in the high-dimensional numeric data, and to highlight a particular trend an appropriate distance measure needs to be chosen. The choice of the distance measure for measuring expression divergence can be dictated by the expression patterns that are of interest in a particular study. REVIEWERS: This article was reviewed by Mikhail Gelfand, Eugene Koonin and Subhajyoti De (nominated by Sarah Teichmann).
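The contrast between the two distance families can be seen in a small Python sketch. The expression values below are invented for illustration and are not taken from the paper. The correlation-based distance is assumed here to be one minus the Pearson correlation, which is a common choice but not necessarily the exact definition the authors used.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def correlation_distance(x, y):
    """One minus the Pearson correlation (assumed definition)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1 - cov / (sd_x * sd_y)


# Made-up profiles over four homologous tissues in two species.
# Gene A: near-background expression everywhere, with unrelated small noise.
flat_human, flat_rat = [1.0, 1.1, 0.9, 1.0], [1.0, 0.9, 1.1, 1.0]
# Gene B: same tissue-to-tissue pattern, but on a different overall scale.
concerted_human, concerted_rat = [1.0, 5.0, 2.0, 8.0], [2.0, 10.0, 4.0, 16.0]

for name, h, r in [("flat", flat_human, flat_rat),
                   ("concerted", concerted_human, concerted_rat)]:
    print(name, euclidean_distance(h, r), correlation_distance(h, r))
```

With these toy values, the Euclidean distance ranks the flat gene as more conserved, while the correlation-based distance ranks the concerted gene as more conserved, mirroring the trend reported in the abstract.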
A topological algorithm for identification of structural domains of proteins
BACKGROUND: Identification of the structural domains of proteins is important for our understanding of the organizational principles and mechanisms of protein folding, and for insights into protein function and evolution. Algorithmic methods developed so far for dissecting proteins of known structure into domains are based on an examination of multiple geometrical, physical and topological features. Successful as many of these approaches are, they employ a lot of heuristics, and it is not clear whether they illuminate any deep underlying principles of protein domain organization. Other well-performing domain dissection methods rely on comparative sequence analysis. These methods are applicable to sequences with known and unknown structure alike, and their success highlights a fundamental principle of protein modularity, but this does not directly improve our understanding of protein spatial structure. RESULTS: We present a novel graph-theoretical algorithm for the identification of domains in proteins with known three-dimensional structure. We represent the protein structure as an undirected, unweighted and unlabeled graph whose nodes correspond to the secondary structure elements and whose edges represent physical proximity of at least one pair of alpha carbon atoms from two elements. Domains are identified as constrained partitions of the graph, corresponding to sets of vertices obtained by the maximization of the cycle distributions found in the graph. The decision to accept or reject a tentative cut position is based on a specific classifier. When a partition is accepted, the algorithm is applied iteratively to each of the resulting subgraphs, and it terminates automatically when partitions are no longer accepted. The distribution of cycles is the only type of information on which the decision about protein dissection is based.
Despite the bare-bones simplicity of the approach, our algorithm approaches the best heuristic algorithms in accuracy. CONCLUSION: Our graph-theoretical algorithm uses only topological information present in the protein structure itself to find the domains and does not rely on any geometrical or physical information about the protein molecule. Perhaps unexpectedly, these drastic constraints on resources, which result in a seemingly approximate description of protein structures and leave only a handful of parameters available for analysis, do not lead to any significant deterioration of algorithm accuracy. It appears that protein structures can be rigorously treated as topological rather than geometrical objects and that the majority of information about protein domains can be inferred from the coarse-grained measure of pairwise proximity between secondary structure elements.
Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock
While genome-wide gene expression data are generated at an increasing rate, the repertoire of approaches for pattern discovery in these data is still limited. Identifying subtle patterns of interest in large amounts of data (tens of thousands of profiles) associated with a certain level of noise remains a challenge. A microarray time series was recently generated to study the transcriptional program of the mouse segmentation clock, a biological oscillator associated with the periodic formation of the segments of the body axis. A method related to Fourier analysis, the Lomb-Scargle periodogram, was used to detect periodic profiles in the dataset, leading to the identification of a novel set of cyclic genes associated with the segmentation clock. Here, we applied to the same microarray time series dataset four distinct mathematical methods to identify significant patterns in gene expression profiles. These methods are called: Phase consistency, Address reduction, Cyclohedron test and Stable persistence, and are based on different conceptual frameworks that are either hypothesis- or data-driven. Some of the methods, unlike Fourier transforms, are not dependent on the assumption of periodicity of the pattern of interest. Remarkably, these methods identified blindly the expression profiles of known cyclic genes as the most significant patterns in the dataset. Many candidate genes predicted by more than one approach appeared to be true positive cyclic genes and will be of particular interest for future research. In addition, these methods predicted novel candidate cyclic genes that were consistent with previous biological knowledge and experimental validation in mouse embryos. 
Our results demonstrate the utility of these novel pattern detection strategies, notably for detection of periodic profiles, and suggest that combining several distinct mathematical approaches to analyze microarray datasets is a valuable strategy for identifying genes that exhibit novel, interesting transcriptional patterns
Similarity searches in genome-wide numerical data sets
We present psi-square, a program for searching the space of gene vectors. The program starts with a gene vector, i.e., the set of measurements associated with a gene, finds similar vectors, derives a probabilistic model of these vectors, then repeats the search using this model as a query, and continues to update the model and search again until convergence. When applied to three different pathway-discovery problems, psi-square was generally more sensitive and sometimes more specific than the ad hoc methods previously developed for solving each of these problems. REVIEWERS: This article was reviewed by King Jordan, Mikhail Gelfand, Nicolas Galtier and Sarah Teichmann
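The search-model-search loop described in this abstract can be sketched in Python as follows. This is not psi-square itself: the abstract does not specify its probabilistic model or similarity score, so the sketch substitutes a per-coordinate mean profile and a fixed Euclidean radius, both of which are assumptions, as are all function and parameter names.

```python
import math


def distance(x, y):
    """Euclidean distance, used here as a stand-in similarity score."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def iterative_search(vectors, query_name, radius, max_rounds=50):
    """Repeat: search with the current model, rebuild the model from the hits.

    `vectors` maps gene names to equal-length lists of measurements.
    The "model" is simply the mean profile of the current hits (an assumed
    simplification of a probabilistic model).
    """
    model = list(vectors[query_name])
    hits = {query_name}
    for _ in range(max_rounds):
        new_hits = {name for name, v in vectors.items()
                    if distance(v, model) <= radius}
        if not new_hits or new_hits == hits:
            break  # converged: the hit set no longer changes
        hits = new_hits
        members = [vectors[name] for name in hits]
        model = [sum(column) / len(column) for column in zip(*members)]
    return hits


# Toy data: g3 is too far from the seed g1 to be found directly, but is
# reached once the model is updated with g2.
toy_vectors = {
    "g1": [0.0, 0.0],
    "g2": [1.0, 0.0],
    "g3": [2.0, 0.0],
    "g4": [10.0, 10.0],
}
print(sorted(iterative_search(toy_vectors, "g1", radius=1.6)))
```

The toy run shows why iterating can be more sensitive than a single query: the updated model recovers a vector that the original query alone would miss.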
