
162 research outputs found

Draught risk index tool for building energy simulations

Author: Jensen Rasmus Lund
Nielsen Peter V.
Vorre Mette Havgaard
Publication date: 01/10/2014

VBN (Videnbasen) Aalborg Universitets forskningsportal

Structural alignment of RNA with FOLDALIGN

Author: Havgaard Jakob Hull
Publication venue: Center for Skov, Landskab og Planlægning/Københavns Universitet
Publication date: 01/01/2007

Copenhagen University Research Information System

Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment

Author: A Löytynoja
B Sipos
BG Hall
BP Blackburne
C Chothia
C Dessimoz
C Kemena
C Notredame
CB Do
CL Strope
DA Dalquen
DA Morrison
DH Mathews
ER Mardis
G Blackshields
G Jordan
G Landan
GP Raghava
I Walle Van
J Kim
J Stoye
JD Thompson
JH Havgaard
JP Huelsenbeck
K Mizuguchi
LA Stebbings
M Anisimova
M Pop
MR Aniba
P Gardner
RA Cartwright
RB Russell
RC Edgar
SA Berger
SF Altschul
T Golubchik
T Koestler
T Lassmann
W Fletcher
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/11/2012

Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to infer related residues among biological sequences. Thus alignment accuracy is crucial to a vast range of analyses, often in ways difficult to assess in those analyses. To compare the performance of different aligners and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued. Here we present an overview of the main strategies (based on simulation, consistency, protein structure, and phylogeny) and discuss their different advantages and associated risks. We outline a set of desirable characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that there is currently no universally applicable means of benchmarking MSA, and that developers and users of alignment tools should base their choice of benchmark on the context of application, with a keen awareness of the assumptions underlying each benchmarking strategy.

arXiv.org e-Print Archive

Crossref

UCL Discovery

Using building simulation to evaluate thermal comfort

Author: Vorre Mette Havgaard
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2015

VBN (Videnbasen) Aalborg Universitets forskningsportal

The PETfold and PETcofold web servers for intra- and intermolecular structures of multiple RNA sequences

Author: Arthur
Gorodkin
Havgaard
J. Gorodkin
Knudsen
Kolb
Mattick
P. Menzel
R. Backofen
S. E. Seemann
Schneider
Seemann
Will
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2011

The function of non-coding RNA genes largely depends on their secondary structure and the interaction with other molecules. Thus, an accurate prediction of secondary structure and RNA–RNA interaction is essential for the understanding of biological roles and pathways associated with a specific RNA gene. We present web servers to analyze multiple RNA sequences for common RNA structure and for RNA interaction sites. The web servers are based on the recent PET (Probabilistic Evolutionary and Thermodynamic) models PETfold and PETcofold, but add user friendly features ranging from a graphical layer to interactive usage of the predictors. Additionally, the web servers provide direct access to annotated RNA alignments, such as the Rfam 10.0 database and multiple alignments of 16 vertebrate genomes with human. The web servers are freely available at: http://rth.dk/resources/petfold

Crossref

FreiDok plus

PubMed Central

Copenhagen University Research Information System

Simultaneous alignment and folding of protein sequences

Author: A. Caprara
B.E. Shakhnovich
C.B. Do
D. Frishman
D. Sankoff
D.H. Mathews
G. Raghava
I.L. Hofacker
J. Selbig
J. Waldispuhl
J.H. Havgaard
L.R. Forrest
M. Brudno
M. Cline
M. Lomize
M. Menke
P. Bradley
P. Fariselli
P. Rice
R. Backofen
R. Doolittle
R.A. Sutormin
R.C. Edgar
R.L.J. Dunbrack
S. Henikoff
S. Will
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Accurate comparative analysis tools for low-homology proteins remain a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm’s complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Testing against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise sequence alignment tools in the most difficult low sequence homology case and improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families. partiFold-Align is available at http://partiFold.csail.mit.edu

Fast Pairwise Structural RNA Alignments by Pruning of the Dynamical Programming Matrix

Author: David Mathews
Elfar Torarinsson
Jakob H Havgaard
Jan Gorodkin
Publication venue: Public Library of Science
Publication date: 01/01/2007

It has become clear that noncoding RNAs (ncRNA) play important roles in cells, and emerging studies indicate that there might be a large number of unknown ncRNAs in mammalian genomes. There exist computational methods that can be used to search for ncRNAs by comparing sequences from different genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may not be present and pre-folding ignores the comparative information. Here, pruning of the dynamical programming matrix is presented as an alternative novel heuristic constraint. All subalignments that do not exceed a length-dependent minimum score are discarded as the matrix is filled out, thus giving the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained. Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned and backtracked in a normal fashion. Finally, the FOLDALIGN algorithm has also been updated with a better memory implementation and an improved energy model. With these improvements in the algorithm, the FOLDALIGN software package provides the molecular biologist with an efficient and user-friendly tool for searching for new ncRNAs. The software package is available for download at http://foldalign.ku.dk
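The pruning heuristic described in this abstract can be illustrated with a minimal sketch. It is not the FOLDALIGN implementation: it aligns plain sequences only (no structure or energy model), and the scoring constants, the function name `pruned_local_alignment`, and the threshold function `min_score` are all assumptions made for illustration. The idea shown is only that a subalignment whose score falls below a length-dependent minimum is dropped while the matrix is filled, so it can never be extended later.

```python
from typing import Callable, Dict, Tuple

# Hypothetical scoring values, chosen for illustration only.
MATCH, MISMATCH, GAP = 2, -1, -2


def pruned_local_alignment(
    a: str, b: str, min_score: Callable[[int], float]
) -> Tuple[int, int]:
    """Return (best score found, number of matrix cells kept)."""
    # kept[(i, j)] = (score, length) of the best subalignment ending at cell (i, j).
    # Cells that fail the threshold are never stored (sparse matrix).
    kept: Dict[Tuple[int, int], Tuple[int, int]] = {}
    best = 0
    for i in range(len(a)):
        for j in range(len(b)):
            s = MATCH if a[i] == b[j] else MISMATCH
            candidates = [(s, 1)]  # start a new subalignment at this cell
            moves = (((i - 1, j - 1), s), ((i - 1, j), GAP), ((i, j - 1), GAP))
            for prev_cell, delta in moves:
                prev = kept.get(prev_cell)
                if prev is not None:  # pruned cells cannot be extended
                    candidates.append((prev[0] + delta, prev[1] + 1))
            score, length = max(candidates)
            # Length-dependent pruning: discard the cell if its score is too low.
            if score >= min_score(length):
                kept[(i, j)] = (score, length)
                best = max(best, score)
    return best, len(kept)


# Compare an unpruned run with a run that requires score >= length.
full = pruned_local_alignment("ACGTACGT", "ACGTACGT", lambda n: float("-inf"))
pruned = pruned_local_alignment("ACGTACGT", "ACGTACGT", lambda n: n)
print(full, pruned)
```

In this toy setting the threshold trades stored cells against the risk of discarding a subalignment that would have recovered later, which is the same trade-off the abstract reports as lower time and memory use with maintained overall performance.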

Crossref

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

The Francis Crick Institute

Foldalign 2.5: multithreaded implementation for pairwise structural RNA alignment

Author: de Melo Alba C. M. A.
Gorodkin Jan
Havgaard Jakob Hull
Sundfeld Daniel
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/12/2015

Motivation: Structured RNAs can be hard to search for as they often are not well conserved in their primary structure and are local in their genomic or transcriptomic context. Thus, the need for tools which in particular can make local structural alignments of RNAs is only increasing. Results: To meet the demand for both large-scale screens and hands-on analysis through web servers, we present a new multithreaded version of Foldalign. We substantially improve execution time while maintaining all previous functionalities, including carrying out local structural alignments of sequences with low similarity. Furthermore, the improvements allow for comparing longer RNAs and increasing the sequence length. For example, for lengths in the range 2000–6000 nucleotides, execution time improves by up to a factor of five. Availability and implementation: The Foldalign software and the web server are available at http://rth.dk/resources/foldalign Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online

Copenhagen University Research Information System

PubMed Central

Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags.

RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
BACKGROUND: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages.
RESULTS: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories.
CONCLUSION: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies.

Crossref

Springer - Publisher Connector

PubMed Central

Copenhagen University Research Information System

Syddansk Universitets Forskerportal

Apollo (Cambridge)

Online Research Database In Technology

ScholarBank@NUS

RNAscClust: Clustering RNA sequences using structure conservation and graph based motifs

Author: Backofen Rolf
Costa Fabrizio
Gorodkin Jan
Havgaard Jakob Hull
Junge Alexander
Miladi Milad
Seemann Stefan E.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Motivation: Clustering RNA sequences with common secondary structure is an essential step towards studying RNA function. Whereas structural RNA alignment strategies typically identify common structure for orthologous structured RNAs, clustering seeks to group paralogous RNAs based on structural similarities. However, existing approaches for clustering paralogous RNAs do not take the compensatory base pair changes obtained from structure conservation in orthologous sequences into account. Results: Here, we present RNAscClust, the implementation of a new algorithm to cluster a set of structured RNAs taking their respective structural conservation into account. For a set of multiple structural alignments of RNA sequences, each containing a paralog sequence included in a structural alignment of its orthologs, RNAscClust computes minimum free-energy structures for each sequence using conserved base pairs as prior information for the folding. The paralogs are then clustered using a graph kernel-based strategy, which identifies common structural features. We show that the clustering accuracy clearly benefits from an increasing degree of compensatory base pair changes in the alignments. Availability and Implementation: RNAscClust is available at http://www.bioinf.uni-freiburg.de/Software/RNAscClust

Crossref

FreiDok plus

Copenhagen University Research Information System