UNIL IRIS | Institutional Research Information System

The Francis Crick Institute

The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC.

Authors: Gharib W.H.; Robinson-Rechavi M.
Publication date: 01/01/2013

Positive selection is widely estimated from protein coding sequence alignments by the nonsynonymous-to-synonymous ratio ω. Increasingly elaborate codon models are used in a likelihood framework for this estimation. Although there is widespread concern about the robustness of the estimation of the ω ratio, more efforts are needed to estimate this robustness, especially in the context of complex models. Here, we focused on the branch-site codon model. We investigated its robustness on a large set of simulated data. First, we investigated the impact of sequence divergence. We found evidence of underestimation of the synonymous substitution rate (dS) for values as small as 0.5, with a slight increase in false positives for the branch-site test. When dS increases further, underestimation of dS is worse, but false positives decrease. Interestingly, the detection of true positives follows a similar distribution, with a maximum for intermediate values of dS. Thus, high dS is more of a concern for a loss of power (false negatives) than for false positives of the test. Second, we investigated the impact of GC content. We showed that there is no significant difference in false positives between high GC (up to ∼80%) and low GC (∼30%) genes. Moreover, neither shifts of GC content on a specific branch nor major shifts in GC along the gene sequence generate many false positives. Our results confirm that the branch-site test is very conservative.
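
As an illustration of the likelihood framework mentioned in this abstract, the sketch below shows how a likelihood ratio test between a null and an alternative codon model could be evaluated. The function name and the log-likelihood values are hypothetical, and the chi-square reference distribution with 1 degree of freedom is an assumption of this sketch, not a detail taken from the abstract.

```python
import math
from typing import Tuple


def likelihood_ratio_test(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Return (statistic, p_value) for a likelihood ratio test with 1 df.

    lnl_null and lnl_alt are the maximized log-likelihoods of the null
    and alternative models (hypothetical inputs for this sketch).
    """
    # Test statistic: twice the log-likelihood difference, floored at zero
    # because numerical optimization can return slightly negative values.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test(lnl_null=-10234.7, lnl_alt=-10230.1)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

A small p-value here would only indicate that the alternative model fits better; correcting for multiple testing across genes and branches is left out of the sketch.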

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Age-dependent gain of alternative splice forms and biased duplication explain the relation between splicing and duplication.

Authors: Robinson-Rechavi M.; Roux J.
Publication venue: Cold Spring Harbor Laboratory
Publication date: 20/12/2010

We analyze here the relation between alternative splicing and gene duplication in light of recent genomic data, with a focus on the human genome. We show that the previously reported negative correlation between level of alternative splicing and family size no longer holds true. We clarify this pattern and show that it is sufficiently explained by two factors. First, genes progressively gain new splice variants with time. The gain is consistent with a selectively relaxed regime, until purifying selection slows it down as aging genes accumulate a large number of variants. Second, we show that duplication does not lead to a loss of splice forms, but rather that genes with low levels of alternative splicing tend to duplicate more frequently. This leads us to reconsider the role of alternative splicing in duplicate retention.

Tissue-Specific Evolution of Protein Coding Genes in Human and Mouse.

Authors: Kryuchkova-Mostacci N.; Robinson-Rechavi M.
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking into account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which is explained by weak purifying selection rather than by positive selection.

Contribution of electrostatic interactions, compactness and quaternary structure to protein thermostability: lessons from structural genomics of Thermotoga maritima.

Authors: Alibés A.; Godzik A.; Robinson-Rechavi M.
Publication venue: Elsevier BV
Publication date: 01/01/2006

Studies of the structural basis of protein thermostability have produced a confusing picture. Small sets of proteins have been analyzed from a variety of thermophilic species, suggesting different structural features as responsible for protein thermostability. Taking advantage of the recent advances in structural genomics, we have compiled a relatively large protein structure dataset, which was constructed very carefully and selectively; that is, the dataset contains only experimentally determined structures of proteins from one specific organism, the hyperthermophilic bacterium Thermotoga maritima, and those of close homologs from mesophilic bacteria. In contrast to the conclusions of previous studies, our analyses show that oligomerization order, hydrogen bonds, and secondary structure play minor roles in adaptation to hyperthermophily in bacteria. On the other hand, the data exhibit very significant increases in the density of salt-bridges and in compactness for proteins from T. maritima. The latter effect can be measured by contact order or solvent accessibility, and network analysis shows a specific increase in highly connected residues in this thermophile. These features account for changes in 96% of the protein pairs studied. Our results provide a clear picture of protein thermostability in one species, and a framework for future studies of thermal adaptation.

No evidence for the radiation time lag model after whole genome duplications in Teleostei.

Authors: Laurent S.; Robinson-Rechavi M.; Salamin N.
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2017

The short and long term effects of polyploidization on the evolutionary fate of lineages are still unclear despite much interest. First recognized in land plants, it has become clear that polyploidization is widespread in eukaryotes, notably at the origin of vertebrates and teleost fishes. Many hypotheses have been proposed to link the species richness of lineages and whole genome duplications. For instance, the radiation time lag model suggests that paleopolyploidy would favour the appearance of new phenotypic traits, although the radiation of the lineage would not occur before a later dispersion event. Some results indicate that this model may be observed during land plant evolution. In this work, we test predictions of the radiation time lag model using both fossil data and molecular phylogenies in ancient and more recent teleost whole genome duplications. We fail to find any evidence of delayed increase of the species number after any of these events and conclude that paleopolyploidization still remains to be unambiguously linked to taxonomic diversity in teleosts.

Tissue-Specificity of Gene Expression Diverges Slowly between Orthologs, and Rapidly between Paralogs.

Authors: Kryuchkova-Mostacci N.; Robinson-Rechavi M.
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

The ortholog conjecture implies that functional similarity between orthologous genes is higher than between paralogs. It has been supported using levels of expression and Gene Ontology term analysis, although the evidence was rather weak and there were also conflicting reports. In this study on 12 species we provide strong evidence of high conservation in tissue-specificity between orthologs, in contrast to low conservation between within-species paralogs. This allows us to shed new light on the evolution of gene expression patterns. While there have been several studies of the correlation of expression between species, little is known about the evolution of tissue-specificity itself. Ortholog tissue-specificity is strongly conserved between all tetrapod species, with the lowest Pearson correlation between mouse and frog at r = 0.66. Tissue-specificity correlation decreases strongly with divergence time. Paralogs in human show much lower conservation, even for recent Primate-specific paralogs. When both paralogs from an ancient whole genome duplication are tissue-specific, it is often to different tissues, while other tissue-specific paralogs are mostly specific to the same tissue. The same patterns are observed using human or mouse as focal species, and are robust to choices of datasets and of thresholds. Our results support the following model of evolution: in the absence of duplication, tissue-specificity evolves slowly, and tissue-specific genes do not change their main tissue of expression; after small-scale duplication the less expressed paralog loses the ancestral specificity, leading to an immediate difference between paralogs; over time, both paralogs become more broadly expressed, but remain poorly correlated. Finally, there is a small number of paralog pairs which stay tissue-specific with the same main tissue of expression, for at least 300 million years.
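
To make the correlation measure quoted in this abstract concrete, the sketch below computes a Pearson correlation between per-gene tissue-specificity scores in two species. The scores are made-up numbers and the function name is hypothetical; how tissue-specificity itself is scored is not stated in the abstract and is not assumed here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical specificity scores for five ortholog pairs (illustration only).
species_a = [0.91, 0.35, 0.60, 0.12, 0.78]
species_b = [0.85, 0.40, 0.55, 0.20, 0.70]
print(f"r = {pearson_r(species_a, species_b):.2f}")
```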

What to compare and how: Comparative transcriptomics for Evo-Devo.

Authors: Robinson-Rechavi M.; Rosikiewicz M.; Roux J.
Publication venue: Wiley
Publication date: 01/01/2015

Evolutionary developmental biology has grown historically from the capacity to relate patterns of evolution in anatomy to patterns of evolution of expression of specific genes, whether between very distantly related species, or very closely related species or populations. Scaling up such studies by taking advantage of modern transcriptomics brings promising improvements, allowing us to estimate the overall impact and molecular mechanisms of convergence, constraint or innovation in anatomy and development. But it also presents major challenges, including the computational definitions of anatomical homology and of organ function, the criteria for the comparison of developmental stages, the annotation of transcriptomics data to proper anatomical and developmental terms, and the statistical methods to compare transcriptomic data between species to highlight significant conservation or changes. In this article, we review these challenges, and the ongoing efforts to address them, which are emerging from bioinformatics work on ontologies, evolutionary statistics, and data curation, with a focus on their implementation in the context of the development of our database Bgee (http://bgee.org). J. Exp. Zool. (Mol. Dev. Evol.) 324B: 372-382, 2015. © 2015 Wiley Periodicals, Inc

State aggregation for fast likelihood computations in molecular evolution.

Authors: Davydov I.I.; Robinson-Rechavi M.; Salamin N.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

MOTIVATION: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large size of the state space of the Markov processes used to model codon evolution makes it difficult to use these models with large biological datasets. We propose here to use state aggregation to reduce the state space of codon models and, thus, improve the computational performance of likelihood estimation on these models. RESULTS: We show that this heuristic speeds up the computations of the M0 and branch-site models up to 6.8 times. We also show through simulations that state aggregation does not introduce a detectable bias. We analysed a real dataset and show that aggregation provides highly correlated predictions compared to the full likelihood computations. Finally, state aggregation is a very general approach and can be applied to any continuous-time Markov process-based model with large state space, such as amino acid and coevolution models. We therefore discuss different ways to apply state aggregation to Markov models used in phylogenetics. AVAILABILITY: The heuristic is implemented in the godon package (https://bitbucket.org/Davydov/godon) and in a version of FastCodeML (https://gitlab.isb-sib.ch/phylo/fastcodeml). CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online
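
The idea of shrinking a Markov state space can be illustrated with a generic lumping of a rate matrix, sketched below. This is a minimal example under stated assumptions, not the scheme implemented in godon or FastCodeML: the function name, the toy 3-state matrix, the weights, and the grouping are all hypothetical.

```python
from typing import List, Sequence


def aggregate_rate_matrix(
    q: Sequence[Sequence[float]],
    weights: Sequence[float],
    groups: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Lump the states of rate matrix q into the given groups.

    The rate from group A to group B is the weighted average, over the
    states i in A, of the total rate from i into the states of B.
    """
    aggregated = []
    for src in groups:
        total_weight = sum(weights[i] for i in src)
        row = []
        for dst in groups:
            rate = sum(
                weights[i] / total_weight * sum(q[i][j] for j in dst)
                for i in src
            )
            row.append(rate)
        aggregated.append(row)
    return aggregated


# Toy 3-state rate matrix (each row sums to zero); states 1 and 2 are merged.
q = [[-3.0, 1.0, 2.0], [1.0, -2.0, 1.0], [2.0, 2.0, -4.0]]
weights = [0.5, 0.25, 0.25]  # assumed state weights, e.g. frequencies
small_q = aggregate_rate_matrix(q, weights, groups=[[0], [1, 2]])
print(small_q)
```

Because each row of the input sums to zero and the groups partition the states, each row of the aggregated matrix also sums to zero, so it remains a valid rate matrix on the smaller state space.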

Selectome: a database of positive selection.

Authors: Moretti S.; Proux E.; Robinson-Rechavi M.; Studer R.A.
Publication venue: Oxford University Press (OUP)
Publication date: 28/10/2008

Genome wide scans have shown that positive selection is relatively frequent at the molecular level. It is of special interest to identify which protein sites and which phylogenetic branches are affected. We present Selectome, a database which provides the results of a rigorous branch-site specific likelihood test for positive selection. The Web interface presents test results mapped both onto phylogenetic trees and onto protein alignments. It allows rapid access to results by keyword, gene name, or taxonomy based queries. Selectome is freely available at http://bioinfo.unil.ch/selectome/
