arXiv.org e-Print Archive

CERN Document Server

Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized

Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state -- a necessary component of these potentials -- is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities reference ratio distributions, which derive from the application of the reference ratio method. This new view is not only of theoretical relevance, but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
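As a rough illustration of the pairwise-distance potentials the abstract refers to, the sketch below computes the familiar textbook form, energy = -kT * ln(P_observed / P_reference), for each distance bin. The function name, the bin counts and the flat reference histogram are all hypothetical. The sketch does not implement the reference ratio method itself; it only shows where a reference state enters the calculation.

```python
import math

def potential_of_mean_force(observed_counts, reference_counts, kT=1.0):
    """Return -kT * ln(P_obs / P_ref) for each distance bin.

    Bins with a zero count in either histogram yield None, because the
    log ratio is undefined there.
    """
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            energies.append(None)
            continue
        p_obs = obs / n_obs
        p_ref = ref / n_ref
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies

# Hypothetical counts of residue pairs per distance bin (illustrative only).
observed = [5, 40, 30, 25]
# Assumed flat reference state; choosing this histogram is exactly the
# open problem the abstract describes.
reference = [25, 25, 25, 25]
pmf = potential_of_mean_force(observed, reference)
```

Bins that are more populated in the observed data than in the reference come out with negative (favorable) energy, and depleted bins come out positive.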

Public Library of Science (PLOS)

Copenhagen University Research Information System

Knowledge-based energy functions for computational studies of proteins

Author: A. Ben-Naim
A. Godzik
A. Rossi
A.J. Bordner
A.V. Finkelstein
B. Fain
B. Krishnamoorthy
B. Kuhlman
B. Schölkopf
B.H. Park
B.I. Dahiyat
B.J. McConkey
B.O. Mitchell
C. Anfinsen
C. Carter Jr.
C. Czaplewski
C. Hoppe
C. Hu
C. Micheletti
C. Papadimitriou
C. Zhang
C.A. Rohl
C.B. Anfinsen
C.M.R Lemer
C.S. Mészáros
D. Gilis
D. Tobi
D. Xu
E. Venclovas
E.I. Shakhnovich
F.A. Momany
H. Dobbs
H. Edelsbrunner
H. Gan
H. Li
H. Lu
H. Zhou
H.S. Chan
I. Muegge
J. Khatun
J. Liang
J.A. Kocher
J.A. Rank
J.M. Deutsch
J.R. Bienkowska
K. Nishikawa
K. Sale
K.H. Lee
K.K. Koretke
K.T. Simons
L. Adamian
L.A. Mirny
L.L. Looger
L.M. Amzel
M. Karplus
M. Levitt
M. Vendruscolo
M.H. Hao
M.J. Sippl
M.P. Eastwood
M.R. Betancourt
M.S. Friedrichs
N. Karmarkar
N.V. Buchete
P. Koehl
P.D. Thomas
P.G. Wolynes
P.J. Munson
R. Goldstein
R. Guerois
R. Jackups Jr.
R. Janicke
R. Méndez
R. Samudrala
R.B. Hill
R.I. Dima
R.J. Vanderbei
R.K. Singh
R.L. Jernigan
R.S. DeWitte
S. Liu
S. Miyazawa
S. Shimizu
S. Tanaka
S.J. Wodak
T. Kortemme
T. Lazaridis
T.L. Chiu
U. Bastolla
V. Vapnik
V.N. Maiorov
W.P. Russ
X. Li
Y. Duan
Y. Park
Y. Xia
Publication venue: Springer Science and Business Media LLC
Publication date: 19/01/2006

This chapter discusses the theoretical framework and methods for developing knowledge-based potential functions essential for protein structure prediction, protein-protein interaction, and protein sequence design. We discuss in some detail the Miyazawa-Jernigan contact statistical potential, distance-dependent statistical potentials, and geometric statistical potentials. We also describe a geometric model for developing both linear and non-linear potential functions by optimization. Applications of knowledge-based potential functions in protein-decoy discrimination, in protein-protein interactions, and in protein design are then described. Finally, several issues of knowledge-based potential functions are discussed. Comment: 57 pages, 6 figures. To be published in a book by Springer.
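One way to picture a linear potential function obtained by optimization is as a weight vector that must score the native structure lower than every decoy. The sketch below uses a perceptron-style update as a stand-in; that choice is an assumption made for illustration and is not necessarily the optimization described in the chapter. The feature vectors (counts of three hypothetical contact types) and all function names are invented.

```python
def score(weights, contact_counts):
    """Linear potential: weighted sum of contact-type counts."""
    return sum(w * c for w, c in zip(weights, contact_counts))

def fit_linear_potential(native, decoys, epochs=100, learning_rate=0.1):
    """Search for weights such that score(native) < score(decoy) for all decoys.

    Perceptron-style rule (an illustrative substitute, see lead-in): whenever
    a decoy scores no higher than the native, move the weights along
    (decoy - native) to raise the decoy's score relative to the native's.
    """
    weights = [0.0] * len(native)
    for _ in range(epochs):
        updated = False
        for decoy in decoys:
            if score(weights, native) >= score(weights, decoy):
                weights = [
                    w + learning_rate * (d - n)
                    for w, d, n in zip(weights, decoy, native)
                ]
                updated = True
        if not updated:  # every decoy already scores above the native
            break
    return weights

# Hypothetical contact-count vectors (illustrative only).
native = [8, 2, 1]
decoys = [[3, 5, 3], [4, 4, 4]]
weights = fit_linear_potential(native, decoys)
```

If no weight vector separates the native from the decoys, the loop simply stops after `epochs` passes. That limitation is one motivation for the non-linear functions the abstract mentions.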

arXiv.org e-Print Archive

Comprehensive computational analysis of Hmd enzymes and paralogs in methanogenic Archaea

Author: Goldman Aaron D
Leigh John A
Samudrala Ram
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Methanogenesis is the sole means of energy production in methanogenic Archaea. H2-forming methylenetetrahydromethanopterin dehydrogenase (Hmd) catalyzes a step in the hydrogenotrophic methanogenesis pathway in class I methanogens. At least one hmd paralog has been identified in nine of the eleven complete genome sequences of class I hydrogenotrophic methanogens. The products of these paralog genes have thus far eluded any detailed functional characterization. Results: Here we present a thorough computational analysis of Hmd enzymes and paralogs that includes state-of-the-art phylogenetic inference, structure prediction, and functional site prediction techniques. We determine that the Hmd enzymes are phylogenetically distinct from Hmd paralogs but share a common overall structure. We predict that the active site of the Hmd enzyme is conserved as a functional site in Hmd paralogs and use this observation to propose possible molecular functions of the paralog that are consistent with previous experimental evidence. We also identify an uncharacterized site in the N-terminal domains of both proteins that is predicted by our methods to directly impart function. Conclusion: This study contributes to our understanding of the evolutionary history, structural conservation, and functional roles of the Hmd enzymes and paralogs. The results of our phylogenetic and structural analysis constitute datasets that will aid in the future study of the Hmd protein family. Our functional site predictions generate several testable hypotheses that will guide further experimental characterization of the Hmd paralog. This work also represents a novel approach to protein function prediction in which multiple computational methods are integrated to achieve a detailed characterization of proteins that are not well understood.

Springer - Publisher Connector

The evolution and functional repertoire of translation proteins following the origin of life

Author: Baross John A
Goldman Aaron D
Samudrala Ram
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The RNA world hypothesis posits that the earliest genetic system consisted of informational RNA molecules that directed the synthesis of modestly functional RNA molecules. Further evidence suggests that it was within this RNA-based genetic system that life developed the ability to synthesize proteins by translating genetic code. Here we investigate the early development of the translation system through an evolutionary survey of protein architectures associated with modern translation. Results: Our analysis reveals a structural expansion of translation proteins immediately following the RNA world and well before the establishment of the DNA genome. Subsequent functional annotation shows that representatives of the ten most ancestral protein architectures are responsible for all of the core protein functions found in modern translation. Conclusions: We propose that this early robust translation system evolved by virtue of a positive feedback cycle in which the system was able to create increasingly complex proteins to further enhance its own function. Reviewers: This article was reviewed by Janet Siefert, George Fox, and Antonio Lazcano (nominated by Laura Landweber).

Neuer Kopf, alte Ideen?: "Normalisierung" des Front National unter Marine Le Pen [New Head, Old Ideas?: "Normalization" of the Front National under Marine Le Pen]

Author: Andricioaei I.
Balog E.
Baron R.
Benedix A.
Berezovsky I. N.
Bongini L.
Brady G. P.
Chang C.-E.
Chen J.
Chou K.
de Groot B. L.
De Mori G.
Doig A. J.
Edholm O.
Feig M.
Fogolari F.
Frauenfelder H.
Frishman D.
Gupta V.
Hensen U.
Hess B.
Hnizdo V.
Ignacio Fita
Ismer L.
J. Miguel Rubi
Karplus M.
Kellogg E. H.
Killian B. J.
King B. M.
Kuzmanic A.
Lazaridis T.
Li D.-W.
Ma B.
Makhatadze G. I.
Martin Goethe
McCammon J. A.
Meirovitch H.
Numata J.
Ohkubo Y. Z.
Press W. H.
Rossi M.
Samudrala R.
Schlitter J.
Schäfer H.
Schäfer J.
Sciretti D.
Scott W. R. P.
Suárez E.
Theobald D. L.
Tidor B.
Wand A. J.
Williamson J. R.
Zhang J.
Zhou H.-X.
Publication venue: Wissenschaftliche Einrichtungen. DFI - Deutsch Französisches Institut
Publication date: 01/01/2014

This article investigates whether vibrational entropy (VE) is an important contribution to the free energy of globular proteins at ambient conditions. VE represents the major configurational-entropy contribution of these proteins. By definition, it is an average of the configurational entropies of the protein within single minima of the energy landscape, weighted by their occupation probabilities. A large part of it originates from thermal motion of flexible torsion angles, which gives rise to the finite peak widths observed in torsion angle distributions. While VE may affect the equilibrium properties of proteins, it is usually neglected in numerical calculations because it is difficult to account for. Moreover, it is sometimes believed that all well-packed conformations of a globular protein have similar VE anyway. Here, we explicitly measure the VE for six different conformations from simulation data of a test protein. Estimates are obtained using the quasi-harmonic approximation for three coordinate sets: Cartesian, bond-angle-torsion (BAT), and a new set that we term rotamer-degeneracy lifted BAT coordinates. The new set gives improved estimates because it overcomes a known shortcoming of the quasi-harmonic approximation caused by multiply populated rotamer states, and it may serve for VE estimation of macromolecules in a very general context. The obtained VE values depend considerably on the type of coordinates used. However, for all coordinate sets we find large entropy differences between the conformations, of the order of the overall stability of the protein. This result may have important implications for the choice of free energy expressions used in software for protein structure prediction, protein design, and NMR refinement.
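The shortcoming attributed to multiply populated rotamer states can be illustrated with a single torsion angle. The toy sketch below assumes a Gaussian (harmonic) model per coordinate. It compares one Gaussian fitted to samples pooled from two rotamer wells against the occupation-weighted average of per-well Gaussian entropies, which follows the definition of VE quoted above. The sample angles and function names are hypothetical, angle periodicity is ignored, and this is not the rotamer-degeneracy lifted BAT procedure of the article.

```python
import math
import statistics

def gaussian_entropy(variance):
    """Differential entropy, in units of k_B, of a 1-D Gaussian with this variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

def pooled_estimate(angles_by_rotamer):
    """Fit one Gaussian to all samples, ignoring which well they came from."""
    pooled = [angle for rotamer in angles_by_rotamer for angle in rotamer]
    return gaussian_entropy(statistics.pvariance(pooled))

def per_minimum_estimate(angles_by_rotamer):
    """Average the per-well Gaussian entropies, weighted by occupation."""
    total = sum(len(rotamer) for rotamer in angles_by_rotamer)
    return sum(
        len(rotamer) / total * gaussian_entropy(statistics.pvariance(rotamer))
        for rotamer in angles_by_rotamer
    )

# Hypothetical torsion-angle samples (degrees) from two rotamer wells.
rotamers = [
    [55.0, 58.0, 60.0, 62.0, 65.0],
    [175.0, 178.0, 180.0, 182.0, 185.0],
]
pooled = pooled_estimate(rotamers)
per_minimum = per_minimum_estimate(rotamers)
```

The pooled fit treats the gap between the two wells as if it were thermal width, so `pooled` comes out larger than `per_minimum`. The absolute values depend on the angle unit, so only the difference between the two estimates is meaningful here.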

Secretaría de Estado de Cultura

eDoc.VifaPol

Digital.CSIC

The Francis Crick Institute

An Estimate of the Numbers and Density of Low-Energy Structures (or Decoys) in the Conformational Landscape of Proteins

Author: AK Felts
AR Dinner
B Park
BR Brooks
C Hardin
C Keasar
CA Floudas
CL Liu
D Gilis
D Petrey
DJ Finney
ES Huang
F Melo
G Nemethy
Gautham Namasivayam
HA Scheraga
HM Berman
J Tsai
JD Bryngelson
JF Griffin
JN Onuchic
JS Weissman
K Vengadesan
KA Dill
KA Olszewski
Kanagasabai Vadivel
KT Simons
L Holm
LA Mirny
Laurent Kreplak
MJ Sippl
MR Betancourt
O Almog
P Heikinheimo
P Koehl
PG Wolynes
R Bonneau
R Samudrala
RA Goldbeck
SJ Weiner
SS Plotkin
T Herges
T Lazaridis
TR Sosnick
V Kanagasabai
Y Levy
Y Wang
Z Li
Publication venue: Public Library of Science
Publication date: 01/01/2008

The conformational energy landscape of a protein, as calculated by known potential energy functions, has several minima, one of which corresponds to its native structure. It is, however, difficult to comprehensively estimate the actual numbers of low-energy structures (or decoys), the relationships between them, and how the numbers scale with the size of the protein. We have developed an algorithm to rapidly and efficiently identify the low-energy conformers of oligopeptides by using mutually orthogonal Latin squares to sample the potential energy hypersurface. Using this algorithm and the ECEPP/3 potential function, we have made an exhaustive enumeration of the low-energy structures of peptides of different lengths, and have extrapolated these results to larger polypeptides. We show that the number of native-like structures for a polypeptide is, in general, an exponential function of its sequence length. The density of these structures in conformational space remains more or less constant, and all the increase appears to come from an expansion in the volume of the space. These results are consistent with earlier reports that were based on other models and techniques.
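To give a feel for how mutually orthogonal Latin squares can reduce a sampling problem, the sketch below uses the standard construction for prime order n, where square m has entries (m * row + col) mod n, and reads each cell as one sample point. With d variables of n levels each, this yields n * n points instead of n ** d, while every pair of variables still takes every combination of levels exactly once. This is the generic construction, and its use here is an assumption: the abstract does not say how the authors build or apply their squares. All function names are hypothetical.

```python
from itertools import product

def is_prime(n):
    """Trial-division primality check, adequate for small orders."""
    return n >= 2 and all(n % k for k in range(2, int(n ** 0.5) + 1))

def latin_square(order, multiplier):
    """Latin square with entry (multiplier * row + col) mod order."""
    return [
        [(multiplier * row + col) % order for col in range(order)]
        for row in range(order)
    ]

def are_orthogonal(square_a, square_b):
    """True if superimposing the squares gives each ordered pair exactly once."""
    order = len(square_a)
    pairs = {
        (square_a[r][c], square_b[r][c])
        for r, c in product(range(order), repeat=2)
    }
    return len(pairs) == order * order

def mols_sample_points(order, n_variables):
    """Return order**2 tuples of level indices, one index per variable.

    The first two variables take the row and column index; each further
    variable reads its level from one of the orthogonal squares.
    """
    if not is_prime(order):
        raise ValueError("this construction assumes a prime order")
    if not 2 <= n_variables <= order + 1:
        raise ValueError("n_variables must lie between 2 and order + 1")
    squares = [latin_square(order, m) for m in range(1, n_variables - 1)]
    return [
        (r, c, *(square[r][c] for square in squares))
        for r in range(order)
        for c in range(order)
    ]

# Four hypothetical torsion angles with five levels each: 25 points, not 625.
points = mols_sample_points(5, 4)
```

Each level index would then be mapped to an actual torsion-angle value before the energy is evaluated at the sampled points.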

CiteSeerX

Public Library of Science (PLOS)

Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction

Author: A Zemla
AG Murzin
B Park
B Rost
B Wallner
BA Reva
BH Park
C Keasar
CH Wu
Ching-Wai Tan
CS Pettitt
D Eramian
D Shortle
David T Jones
DT Jones
J Moult
J Tsai
KT Simons
LJ McGuffin
M Fasnacht
M Wiederstein
MI Sadowski
MJ Sippl
N Siew
R Samudrala
SCE Tosatto
SF Altschul
W Kabsch
Y Xia
Y Zhang
Publication venue: BioMed Central
Publication date: 01/02/2008

Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks. Results: The best-performing neural network is the one whose input information comprises PSI-BLAST [1] profiles of residue pairs, pairwise distances, and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested in discriminating the native structure from a set of decoys for all decoy datasets tested. Conclusion: This method is demonstrated to be viable, and furthermore evolutionary information is successfully used in the neural networks to improve decoy discrimination.
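The construction of positive and negative training examples described above can be sketched in a few lines. The sketch assumes each example is just a sequence paired with a label, where the sequence is understood to be placed on the unchanged native structure. The function name, the labels and the sequences are hypothetical, and feature extraction and the neural network itself are out of scope.

```python
def make_training_examples(native_sequences):
    """Pair each native sequence (label 1) with its reversal (label 0).

    Palindromic sequences are skipped as negatives, since their reversal
    would be identical to the positive example.
    """
    examples = []
    for sequence in native_sequences:
        examples.append((sequence, 1))
        reversed_sequence = sequence[::-1]
        if reversed_sequence != sequence:
            examples.append((reversed_sequence, 0))
    return examples

# Hypothetical one-letter amino-acid sequences (illustrative only).
examples = make_training_examples(["MKTAYIAK", "ACA"])
```

The palindrome check is a design choice of this sketch rather than something stated in the abstract. It avoids giving the learner the same input with contradictory labels.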

UCL Discovery

LoCo: a novel main chain scoring function for protein structure prediction based on local coordinates

Author: A Godzik
A Mittal
A Mukherjee
AM Poole
B John
B Park
B Robson
C Levinthal
D Rykunov
D Tobi
FE Boas
G Casari
H Zhou
I Bahar
I Georgiev
J Skolnick
JC Kendrew
KA Dill
KT Simons
L Pauling
LA Mirny
M Boniecki
M Safi
M Vendruscolo
MJ Sippl
MR Betancourt
MY Shen
N-V Buchete
PD Thomas
R Das
R Samudrala
Ram Samudrala
S Miyazawa
S Tanaka
S Vajda
SH Bryant
Stewart E Moughon
T Head-Gordon
U Bastolla
V Tozzini
Y Feng
Y Makino
Y Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Springer - Publisher Connector