16 research outputs found

Dynamics of domain coverage of the protein sequence universe

Author: Gregory D. Peterson
Bhanu Rekapalli
Kristin Wuichet
Igor B. Zhulin
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 16/11/2012

Background: The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results: Here we suggest that the true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions: Recent improvements in computational domain modeling result in a decrease, albeit a slow one, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data.

University of Tennessee, Knoxville: Trace

Crossref

Springer - Publisher Connector

PubMed Central

Accelerated Profile HMM Searches

Author: A Jacob
A Krogh
A Milosavljević
A Wozniak
AA Schäffer
B Rekapalli
C Camacho
DR Horn
EK Freyhult
EM Gertz
G Chukkapalli
GA Price
J Landman
JP Walters
K Karplus
LR Rabiner
LS Johnson
M Farrar
M Madera
R Durbin
RD Finn
RP Maddimsetty
S Derrien
S Hunter
S Johnson
Sean R. Eddy
SF Altschul
SJ Melnikoff
SR Eddy
T Oliver
T Rognes
TF Smith
V Chaudhary
V Sachdeva
William R. Pearson
WN Grundy
WR Pearson
Y Sun
YK Yu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Most partial domains in proteins are alignment and annotation artifacts

Author: A Nagy
A Prakash
B Rekapalli
Deborah A Triant
DW Russell
I Sillitoe
K Forslund
LJ Mills
M Punta
MW Gonzalez
P Flicek
PW Rose
Q Xu
RD Finn
S Hunter
S Light
T Madej
TC Sudhof
William R Pearson
WR Pearson
Publication venue: Springer Science and Business Media LLC

Crossref

The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text

Author: A Abi-Haidar
A Ceol
A Chatr-aryamontri
A Cohen
A Kolchinsky
A Lourenco
A McCallum
A Ng
A Yeh
Alfonso Valencia
AM Cohen
Andrew Chatr-aryamontri
Andrew Winter
Ashish V Tendulkar
B Aranda
B Settles
BP Suomela
C Blaschke
C Elkan
C Stark
Charles Elkan
D Bauer
D Salgado
David Salgado
E Marcotte
F Ehrler
F Leitner
F Rinaldi
Fabio Rinaldi
Feifan Liu
Florian Leitner
G Andrew
Gerold Schneider
Gianni Cesareni
GL Poulter
Graciela Gonzalez
H Daumé III
H Hermjakob
H Shatkay
H Wang
Hagit Shatkay
HK Rekapalli
I Donaldson
J Lin
Jean-Fred Fontaine
JR Curran
Keith Noto
KG Dowell
L Tanabe
Leonardo Briganti
Livia Perfetto
Luana Licata
Luis Rocha
Luisa Castagnoli
M Hall
M Harris
M Hollander
M Krallinger
M Oberoi
Marta Iannuccelli
Martin Krallinger
Miguel A Andrade-Navarro
Miguel Vazquez
Mike Tyers
P Wang
R Chowdhary
R Hoffmann
Rafal Rak
Rezarta Islamaj Dogan
Robert Leaman
S Kim
S Matos
S Orchard
Sergio Matos
Shashank Agarwal
Sun Kim
T Kappeler
T Ono
T Zhang
W Baumgartner
W Hersh
W John Wilbur
W Wilbur
Xinglong Wang
Y Niu
Y Sasaki
Z Cao
Zhiyong Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Determining the usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose, the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts, also recording the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods; some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, and some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. In the case of competitive systems with an acceptable recall (above 35%), the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
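The abstract above reports both accuracy and the MCC for an unbalanced collection. As a minimal sketch of why the two can diverge, the following computes both from confusion-matrix counts. The counts are hypothetical, chosen only to illustrate class imbalance; they are not taken from the BioCreative III evaluation, and the function names are illustrative.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, fp, tn, fn):
    """Fraction of all items that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical, imbalanced counts (assumption): 100 relevant abstracts
# among 1,000, with half of the relevant ones missed by the classifier.
tp, fp, tn, fn = 50, 30, 870, 50

acc = accuracy(tp, fp, tn, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"accuracy = {acc:.2f}, MCC = {mcc:.2f}")
```

With these assumed counts the accuracy is high because the majority class dominates, while the MCC stays moderate because it also penalizes the missed relevant items.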

HAL AMU

ART

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

ZORA

MDC Repository

HAL: Hyper Article en Ligne

Monash University Research Portal

Portail HAL UA (Université d'Angers)

Role of Natural and Anthropogenic Loadings on Indian Temperature Trends

Author: B. Padmavathi
R. K. Tiwari
Rajesh Rekapalli
Publication venue: Springer Science and Business Media LLC
Publication date: 17/06/2019

Crossref

Evidence for Nonlinear Coupling of Solar and ENSO Signals in Indian Temperatures During the Past Century

Author: B. Padmavathi
R. K. Tiwari
Rajesh Rekapalli
Publication venue: Springer Science and Business Media LLC
Publication date: 06/09/2014

Crossref

Correction to: Role of Natural and Anthropogenic Loadings on Indian Temperature Trends

Author: B. Padmavathi
R. K. Tiwari
Rajesh Rekapalli
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2022

Crossref

HSP-HMMER

Author: Bhanu Rekapalli
Christian Halloy
Igor B. Zhulin
Publication venue: Association for Computing Machinery (ACM)

Crossref

Supplementary Material for: The Effects of Dairy Components on Energy Partitioning and Metabolic Risk in Mice: A Microarray Study

Author: Bruckbauer A.
Gouffon J.
Rekapalli B.
Zemel M.B.
Publication date: 20/06/2017

Background/Aim: High-calcium diets modulate energy metabolism and suppress inflammatory stress. These effects are primarily mediated by calcium suppression of calcitriol. We have now investigated the effect of additional components in dairy products [branched-chain amino acids (BCAA) and angiotensin-converting enzyme inhibitors (ACEi)] on adipocyte and muscle metabolism in an animal model of diet-induced obesity. Methods: aP2-agouti mice were fed four different 70% restricted diets for 6 weeks: basal-restricted diet (0.4% Ca), nonfat dry milk (1.2% Ca), calcium-depleted milk (0.4% Ca), or basal-restricted diet (0.4% Ca) with supplemented BCAA/ACEi. A high-density oligonucleotide microarray approach was used to compare the effects on energy metabolism. Results: Lipogenic genes in adipose tissue were downregulated in the milk group, while in muscle, protein synthetic pathways were stimulated by the Ca-depleted and low Ca/BCAA/ACEi diets. Pathways involved in inflammation were altered in adipose tissue and muscle by all three diet treatment groups. Conclusions: The results support our previous findings that calcium and BCAA contribute to the alteration of energy partitioning between adipose tissue and muscle. They provide further evidence for a calcium-independent effect of BCAA and ACEi in energy metabolism and inflammation.

The Francis Crick Institute