
228 research outputs found

Detecting and comparing non-coding RNAs in the high-throughput era.

Authors: Bussotti Giovanni, Enright Anton J, Notredame Cedric
Publication venue: Int J Mol Sci
Publication date: 01/01/2013

In recent years there has been growing interest in the field of non-coding RNA. This surge is a direct consequence of the discovery of a huge number of new non-coding genes and of the finding that many of these transcripts are involved in key cellular functions. In this context, accurately detecting and comparing RNA sequences has become important. Aligning nucleotide sequences is a key requisite when searching for homologous genes. Accurate alignments reveal evolutionary relationships, conserved regions and, more generally, any biologically relevant pattern. Comparing RNA molecules is, however, a challenging task. The nucleotide alphabet is simpler and therefore less informative than that of amino acids. Moreover, for many non-coding RNAs, evolution is likely to be constrained mostly at the structural level rather than at the sequence level. This results in very poor sequence conservation, which impedes comparison of these molecules. These difficulties define a context where new methods are urgently needed in order to exploit experimental results to their full potential. This review focuses on the comparative genomics of non-coding RNAs in the context of new sequencing technologies, and deals in particular with two important and timely research aspects: the development of new methods to align RNAs and the analysis of high-throughput data.

Multidisciplinary Digital Publishing Institute

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee

Authors: Armougom Fabrice, Audic Stéphane, Dumas Pierre, Keduas Vladimir, Moretti Sébastien, Notredame Cedric, Poirot Olivier, Schaeli Basile
Publication date: 02/08/2017

Expresso is a multiple sequence alignment server that aligns sequences using structural information. The user only needs to provide sequences. The server runs BLAST to identify close homologues of the sequences within the PDB database. These PDB structures are used as templates to guide the alignment of the original sequences using structure-based sequence alignment methods like SAP or Fugue. The final result is a multiple sequence alignment of the original sequences based on the structural information of the templates. An advanced mode makes it possible to either upload private structures or specify which PDB templates should be used to model each sequence. Provided that suitable structural information is available, Expresso delivers sequence alignments with an accuracy comparable to that of structure-based alignments. The server is available on http://www.tcoffee.or

RERO DOC Digital Library

BlastR—fast and accurate database searches for non-coding RNAs

Authors: Beaudoing Emmanuel, Bucher Philipp, Bussotti Giovanni, Erb Ionas, Notredame Cedric, Raineri Emanuele, Wilm Andreas, Zytnicki Matthias
Publication date: 02/08/2017

We present and validate BlastR, a method for efficiently and accurately searching non-coding RNAs. Our approach relies on the comparison of di-nucleotides using BlosumR, a new log-odd substitution matrix. In order to use BlosumR for comparison, we recoded RNA sequences into protein-like sequences. We then showed that BlosumR can be used along with the BlastP algorithm in order to search non-coding RNA sequences. Using Rfam as a gold standard, we benchmarked this approach and showed BlastR to be more sensitive than BlastN. We also show that BlastR is both faster and more sensitive than BlastP used with a single nucleotide log-odd substitution matrix. BlastR, when used in combination with WU-BlastP, is about 5% more accurate than WU-BlastN and about 50 times slower. The approach shown here is equally effective when combined with the NCBI-Blast package. The software is open-source freeware available from www.tcoffee.org/blastr.htm
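The recoding step mentioned in this abstract (RNA rewritten as a protein-like sequence of di-nucleotide symbols) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not give BlastR's actual symbol table, so `DINUC_TO_LETTER` is an arbitrary, hypothetical mapping of the 16 di-nucleotides onto 16 amino-acid letters, the di-nucleotides are taken as overlapping, and `recode_rna` is a made-up helper name.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Hypothetical choice: any 16 distinct amino-acid letters would do for this sketch.
AMINO_LETTERS = "ACDEFGHIKLMNPQRS"

# Assumed mapping (not BlastR's real table): AA->A, AC->C, AG->D, ..., UU->S.
DINUC_TO_LETTER = {
    a + b: AMINO_LETTERS[i]
    for i, (a, b) in enumerate(product(NUCLEOTIDES, repeat=2))
}


def recode_rna(seq: str) -> str:
    """Recode an RNA (or DNA) sequence as one symbol per overlapping di-nucleotide.

    A sequence of length n yields n - 1 symbols. Characters other than
    A, C, G, U/T raise a KeyError in this simplified version.
    """
    rna = seq.upper().replace("T", "U")
    return "".join(DINUC_TO_LETTER[rna[i:i + 2]] for i in range(len(rna) - 1))


if __name__ == "__main__":
    print(recode_rna("ACGU"))  # AC, CG, GU -> three protein-like symbols
```

A recoded sequence of this kind could then be handed to a protein search tool together with a 16-symbol substitution matrix, which is the role the abstract assigns to BlosumR and BlastP.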

RERO DOC Digital Library

T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension

Authors: Chang Jia-Ming, Di Tommaso Paolo, Montanyola Alberto, Moretti Sebastien, Notredame Cedric, Orobitg Miquel, Taly Jean-François, Xenarios Ioannis
Publication date: 02/08/2017

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functionalities of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high-accuracy alignments while using structural or homology-derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D structure, R-Coffee to align RNA sequences with conserved secondary structures, and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm and can align up to 150 sequences as long as 10 000 residues. It is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.ca

RERO DOC Digital Library

Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud

Authors: Cores Prado Fernando, Di Tommaso Paolo, Espinosa Toni, Guirado Fernández Fernando, Notredame Cedric, Orobitg Cortada Miquel
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2010

Summary: We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. Funding: Centro de Regulación Genómica (CRG to C.N., P.T., C.K.); Plan Nacional (BFU2008-00419); Lleida University, the Spanish ministry of education (TIN2008-05913 to M.O., F.G., F.C.); the Consolider Project (CSD 2007-00050); Super-computacion y e-Ciencia (SYEC)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

PubMed Central

Repositori Obert UdL

Diposit Digital de Documents de la UAB

Lessons Learned: Recommendations for Establishing Critical Periodic Scientific Benchmarking

Authors: Capella-Gutierrez Salvador, de la Iglesia Diana, Dessimoz Christophe, Fernandez José M., Gelpí Josep Lluís, Haas Juergen, Lourenco Analia, Notredame Cedric, Repchevsky Dmitry, Schwede Torsten, Valencia Alfonso
Publication date: 01/01/2017

The dependence of life scientists on software has steadily grown in recent years. For many tasks, researchers have to decide which of the available bioinformatics software is most suitable for their specific needs. Additionally, researchers should be able to objectively select the software that provides the highest accuracy, the best efficiency and the highest level of reproducibility when integrated in their research projects. Critical benchmarking of bioinformatics methods, tools and web services is therefore an essential community service, as well as a critical component of reproducibility efforts. Unbiased and objective evaluations are challenging to set up and can only be effective when built and implemented around community-driven efforts, as demonstrated by the many ongoing community challenges in bioinformatics that followed the success of CASP. Community challenges bring the combined benefits of intense collaboration, transparency and standard harmonization. Only open systems for the continuous evaluation of methods offer a perfect complement to community challenges: they give larger communities of users, which could extend far beyond the community of developers, a window onto the state of development that they can use for their specific projects. By continuous evaluation systems we mean services that are always available and periodically update their data and/or metrics according to a predefined schedule, keeping in mind that performance always has to be seen in terms of each research domain. We argue here that technology is now mature enough to bring community-driven benchmarking efforts to a higher level that should allow effective interoperability of benchmarks across related methods. New technological developments make it possible to overcome the limitations of the first experiences with online benchmarking, e.g. EVA.
We therefore describe OpenEBench, a novel infrastructure designed to establish a continuous automated benchmarking system for bioinformatics methods, tools and web services. OpenEBench is being developed to cater for the needs of the bioinformatics community, especially software developers who need an objective and quantitative way to inform their decisions, as well as the larger community of end-users in their search for unbiased and up-to-date evaluation of bioinformatics methods. As such, OpenEBench should soon become a central place for bioinformatics software developers, community-driven benchmarking initiatives, researchers using bioinformatics methods, and funders interested in the results of methods evaluation. Preprint

UPCommons. Portal del coneixement obert de la UPC

STRIKE: evaluation of protein MSAs using a single 3D structure

Authors: Kemena Carsten, Kleinjung Jens, Notredame Cedric, Taly Jean-Francois
Publication venue: Oxford University Press
Publication date: 28/10/2011

Motivation: Evaluating alternative multiple protein sequence alignments is an important unsolved problem in Biology. The most accurate way of doing this is to use structural information. Unfortunately, most methods require at least two structures to be embedded in the alignment, a condition rarely met when dealing with standard datasets

Crossref

PubMed Central

M-Coffee: combining multiple sequence alignment methods with T-Coffee

Authors: Higgins Desmond G., Notredame Cedric, O'Sullivan Orla, Wallace Iain M.
Publication venue: Oxford University Press
Publication date: 01/01/2006

We introduce M-Coffee, a meta-method for assembling multiple sequence alignments (MSA) by combining the output of several individual methods into one single MSA. M-Coffee is an extension of T-Coffee and uses consistency to estimate a consensus alignment. We show that the procedure is robust to variations in the choice of constituent methods and reasonably tolerant to duplicate MSAs. We also show that performance can be improved by carefully selecting the constituent methods. M-Coffee outperforms all the individual methods on three major reference datasets: HOMSTRAD, Prefab and Balibase. We also show that, on a case-by-case basis, M-Coffee is twice as likely to deliver the best alignment as any individual method. Given a collection of pre-computed MSAs, M-Coffee has similar CPU requirements to the original T-Coffee. M-Coffee is a freeware open-source package available from

CiteSeerX

Crossref

PubMed Central
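The idea stated in the M-Coffee abstract, combining several alignments of the same sequences by looking at where they agree, can be sketched in a heavily simplified form. The code below is not the T-Coffee or M-Coffee implementation: it only counts, for every pair of residues, how many input alignments place them in the same column. The function names `aligned_pairs` and `pair_support`, and the representation of an MSA as a list of gapped strings in a fixed sequence order, are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(msa):
    """Return the set of residue pairs that share a column in one MSA.

    `msa` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence_index, ungapped_position).
    """
    positions = [0] * len(msa)  # next ungapped position for each sequence
    pairs = set()
    for column in zip(*msa):
        residues = []
        for seq_index, char in enumerate(column):
            if char != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def pair_support(msas):
    """Count, for each residue pair, how many of the input MSAs align it."""
    support = Counter()
    for msa in msas:
        support.update(aligned_pairs(msa))
    return support


if __name__ == "__main__":
    msa_a = ["AC-G", "ACTG"]
    msa_b = ["ACG-", "ACTG"]
    for pair, count in sorted(pair_support([msa_a, msa_b]).items()):
        print(pair, count)
```

Pairs supported by many input alignments would be the ones a consensus alignment should try to preserve. How those weights are turned into a final MSA is outside the scope of this sketch.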

Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee

Authors: Chang Jia-Ming, Di Tommaso Paolo, Notredame Cedric, Taly Jean-François
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Transmembrane proteins (TMPs) constitute about 20-30% of all protein coding genes. The relative lack of experimental structure has so far made it hard to develop specific alignment methods and the current state of the art (PRALINE™) only manages to recapitulate 50% of the positions in the reference alignments available from the BAliBASE2-ref7. Methods: We show how homology extension can be adapted and combined with a consistency based approach in order to significantly improve the multiple sequence alignment of alpha-helical TMPs. TM-Coffee is a special mode of PSI-Coffee able to efficiently align TMPs, while using a reduced reference database for homology extension. Results: Our benchmarking on BAliBASE2-ref7 alpha-helical TMPs shows a significant improvement over the most accurate methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. We also estimated the influence of the database used for homology extension and show that highly non-redundant UniRef databases can be used to obtain similar results at a significantly reduced computational cost over full protein databases. TM-Coffee is part of the T-Coffee package, a web server is also available from http://tcoffee.crg.cat/tmcoffee and a freeware open source code can be downloaded from http://www.tcoffee.org/Packages/Stable/Latest.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central