Network deconvolution as a general method to distinguish direct dependencies in networks

Author: A de la Fuente
A Greenfield
A Hartemink
A Pinna
A Seth
AA Margolin
AC Haury
AJ Butte
BG Giraud
CJ Quinn
CK Hemelrijk
D Altschuh
D di Bernardo
D Jones
D Marbach
D Marbach
Daniel Marbach
DFT Veiga
DJ Reiss
DS Marks
DS Marks
E Neher
F Morcos
J Tang
JJ Faith
KD MacIsaac
L Burger
M Ding
M Ekeberg
M Granovetter
M Weigt
Manolis Kellis
MEJ Newman
MEJ Newman
MEJ Newman
MJ Wainwright
Muriel Médard
N Friedman
N Friedman
N Meinshausen
NE Friedkin
R Bonneau
R De Smet
R Koetter
R Küffner
R Sharan
S Gama-Castro
S Lapedes A
SD Dunn
Soheil Feizi
T Nugent
TA Hopf
U Göbel
VA Huynh-Thu
X Shi
X Song
Z Bar-Joseph
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.

Funding: National Institutes of Health (U.S.) (grant R01 HG004037); National Institutes of Health (U.S.) (grant HG005639); Swiss National Science Foundation (Fellowship); National Science Foundation (U.S.) (NSF CAREER Award 0644282)
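
The closed-form idea in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in the abstract, that the observed matrix is the sum of all powers of a symmetric direct matrix, G_obs = G_dir + G_dir^2 + ..., with every eigenvalue of G_dir inside (-1, 1). Under that assumption each eigenvalue maps back as lambda_dir = lambda_obs / (1 + lambda_obs). The function name `network_deconvolution` and the toy chain network are hypothetical, and any rescaling or thresholding a real pipeline may need is omitted.

```python
import numpy as np

def network_deconvolution(g_obs: np.ndarray) -> np.ndarray:
    """Recover direct weights from a symmetric observed matrix.

    Assumes g_obs = g_dir + g_dir^2 + g_dir^3 + ..., which sums to
    g_dir (I - g_dir)^-1 when all eigenvalues of g_dir lie in (-1, 1).
    """
    # Symmetric input gives real eigenvalues and orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(g_obs)
    # Invert the geometric series one eigenvalue at a time.
    dir_vals = eigvals / (1.0 + eigvals)
    return eigvecs @ np.diag(dir_vals) @ eigvecs.T

# Toy chain 0 - 1 - 2: only neighbouring nodes interact directly.
g_dir = np.array([[0.0, 0.4, 0.0],
                  [0.4, 0.0, 0.4],
                  [0.0, 0.4, 0.0]])

# Forward model: add the effect of indirect paths of every length.
g_obs = g_dir @ np.linalg.inv(np.eye(3) - g_dir)

# The observed matrix links nodes 0 and 2; deconvolution removes that link.
recovered = network_deconvolution(g_obs)
```

In this toy case the observed matrix contains a spurious 0-2 entry created by the two-step path through node 1, and the recovered matrix matches the direct one up to floating-point error.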

Identification of HCV protease inhibitor resistance mutations by selection pressure-based method

Author: A. Skelton
Altschuh
C. Cullen
Chen
E. Xia
Ewing
Fried
Hadziyannis
Herrmann
J. Greene
Kass
Kieffer
Li
Li
Liu
Lohmann
Malcolm
Manns
Neher
Nei
P. Qiu
Prongay
Qiu
R. Ralston
S. Curry
S. Liu
TAREMI
Tong
Tong
V. Sanfiorenzo
Wong
X. Tong
Z. Guo
Zhang
Publication venue: Oxford University Press

A major challenge to successful antiviral therapy is the emergence of drug-resistant viruses. Recent studies have developed several automated analyses of HIV sequence polymorphism based on calculations of selection pressure (Ka/Ks) to predict drug resistance mutations. Similar resistance analysis programs for HCV inhibitors are not currently available. Taking advantage of the recently available sequence data of patient HCV samples from a Phase II clinical study of the protease inhibitor boceprevir, we calculated the selection pressure for all codons in the HCV protease region (amino acids 1–181) to identify potential resistance mutations. The correlation between mutations was also calculated to evaluate linkage between any two mutations. Using this approach, we identified previously known major resistance mutations, including a recently reported mutation V55A. In addition, a novel mutation V158I was identified, and we further confirmed its resistance to boceprevir in protease enzyme and replicon assays. We also extended the approach to analyze potential interactions between individual mutations and identified three pairs of correlated changes. Our data suggest that selection pressure-based analysis and correlation mapping could provide useful tools to analyze large amounts of sequencing data from clinical samples and to identify new drug resistance mutations as well as their linkage and correlations.

arXiv.org e-Print Archive

Direct-coupling analysis of residue co-evolution captures native contacts across many protein families

Author: A. Bertolino
A. Pagnani
Altschuh
Atchley
B. Lunt
Berman
Burger
C. Sander
Campbell
D. S. Marks
Dima
Eddy
F. Morcos
Go
Gouveia-Oliveira
Göbel
Halabi
Herrou
Hoch
J. N. Onuchic
Lashuel
Lee
Lockless
Lunt
M. Weigt
Maris
Miyazawa
Neher
Pai
Pasternak
Procaccini
R. Zecchina
Schug
Shindyalov
Skerker
T. Hwa
Tame
Tanaka
Tillier
White
Wisedchaisri
Wollenberg
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2011

The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced Direct Coupling Analysis (DCA) (Weigt et al. (2009) Proc Natl Acad Sci 106:67). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intra-domain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and inter-domain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, provided the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.

Comment: 28 pages, 7 figures, to appear in PNA

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

HAL-Paris1

HAL: Hyper Article en Ligne

Hal-Diderot

PORTO Publications Open Repository TOrino

Protein 3D Structure Computed from Evolutionary Sequence Variation

Author: A Kryshtafovych
A Roy
A Schug
A Zemla
AA Fodor
AF Poon
AF Poon
Andrea Pagnani
Andrej Sali
AP Kamat
AR Ortiz
AR Ortiz
ASGB Lapedes
AT Brunger
B Reva
BG Giraud
C Chothia
Chris Sander
CS Miller
D Altschuh
D Altschuh
D Cozzetto
DE Kim
DE Shaw
Debora S. Marks
E Neher
E Schneidman
EI Shakhnovich
F Morcos
G Kolesov
H Fehlhammer
HRFB Kappen
IN Shindyalov
J DeBartolo
J Moult
J Moult
J Moult
J Qiu
J Skolnick
JM Duarte
JM Skerker
JS Yang
JW Locasale
KT Simons
L Burger
L Burger
L Holm
Lucy J. Colwell
M Mezard
M Miyano
M Vendruscolo
M Weigt
MMT Mezard
N Halabi
N Siew
P Bradley
P Bradley
P Fariselli
P Joost
PMJW Ravikumar
R Das
R Nair
R Sathyapriya
RD Finn
Riccardo Zecchina
RO Dror
Robert Sheridan
S Raman
S Raman
S Wu
S Wu
S Yooseph
SD Dunn
T Mora
TF Havel
Thomas A. Hopf
TR Lezon
TR Lezon
U Göbel
V Morea
VMR Sessak
WP Russ
WR Atchley
WR Taylor
WR Taylor
Y Duan
Y Zhang
Y Zhang
YJAH Roudi
Publication venue: Public Library of Science
Publication date: 01/01/2011

The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The Francis Crick Institute

PORTO Publications Open Repository TOrino

Inference of Co-Evolving Site Pairs: an Excellent Predictor of Contact Residue Pairs in Protein 3D structures

Author: A Doron-Faigenboim
A Gulyás-Kovács
AA Fodor
AFY Poon
CH Yeang
D Altschuh
DD Pollock
DD Pollock
DS Marks
F Morcos
FM Richards
G Bazykin
IN Shindyalov
J Dutheil
J Dutheil
J Dutheil
J Felsenstein
J Romiguier
J Tsai
JD O'Brien
JM Duarte
JM Skerker
JS Yang
K Lie
KT Simons
L Burger
L Burger
LC Martin
M Fares
M Go
M Punta
M Vassura
M Weigt
Marc Robinson-Rechavi
MN Price
MN Price
N Halabi
O Penn
P Bradley
P Fariselli
P Tataru
P Tufféry
PY Chou
R Grantham
R Nielsen
R Sathyapriya
S Guindon
S Maisnier-Patin
S Miyazawa
S Miyazawa
S Miyazawa
S Wu
Sanzo Miyazawa
SD Dunn
SJ Fleishman
SQ Le
SW Lockless
U Göbel
VN Minin
VN Minin
WM Fitch
WP Russ
WR Atchley
WR Taylor
Z Yang
Publication venue: Public Library of Science (PLoS)
Publication date: 08/08/2012

Residue-residue interactions that fold a protein into a unique three-dimensional structure and make it play a specific function impose structural and functional constraints on each residue site. Selective constraints on residue sites are recorded in amino acid orders in homologous sequences and also in the evolutionary trace of amino acid substitutions. A challenge is to extract direct dependences between residue sites by removing indirect dependences through other residues within a protein or even through other molecules. Recent attempts at disentangling direct from indirect dependences of amino acid types between residue positions in multiple sequence alignments have revealed that the strength of inferred residue pair couplings is an excellent predictor of residue-residue proximity in folded structures. Here, we report an alternative attempt at inferring co-evolving site pairs from concurrent and compensatory substitutions between sites in each branch of a phylogenetic tree. First, branch lengths of a phylogenetic tree inferred by the neighbor-joining method are optimized as well as other parameters by maximizing a likelihood of the tree in a mechanistic codon substitution model. Mean changes of quantities, which are characteristic of concurrent and compensatory substitutions, accompanied by substitutions at each site in each branch of the tree are estimated with the likelihood of each substitution. Partial correlation coefficients of the characteristic changes along branches between sites are calculated and used to rank co-evolving site pairs. Accuracy of contact prediction based on the present co-evolution score is comparable to that achieved by a maximum entropy model of protein sequences for 15 protein families taken from the Pfam release 26.0. Besides, this excellent accuracy indicates that compensatory substitutions are significant in protein evolution.

Comment: 17 pages, 4 figures, and 4 tables with supplementary information of 5 figure
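
The role of partial correlation in separating direct from indirect dependence can be shown with a small generic sketch. It is not the paper's pipeline: the abstract computes partial correlations of substitution-related changes along tree branches, whereas this example uses the standard identity that the partial correlation of variables i and j given all others equals -P_ij / sqrt(P_ii P_jj), where P is the inverse of the correlation matrix. The function name `partial_correlations` and the 3-variable matrix are hypothetical.

```python
import numpy as np

def partial_correlations(corr: np.ndarray) -> np.ndarray:
    """Partial correlation of each pair given all remaining variables."""
    # The precision matrix is the inverse of the correlation matrix.
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    # By convention each variable has partial correlation 1 with itself.
    np.fill_diagonal(partial, 1.0)
    return partial

# Chain-like dependence: sites 0 and 2 are each coupled to site 1 (0.8),
# and their own correlation (0.64 = 0.8 * 0.8) is purely indirect.
corr = np.array([[1.00, 0.80, 0.64],
                 [0.80, 1.00, 0.80],
                 [0.64, 0.80, 1.00]])

partial = partial_correlations(corr)
```

Here the 0-2 entry of `partial` vanishes up to rounding while the 0-1 and 1-2 entries stay clearly positive, so ranking pairs by partial correlation would place the indirect pair last.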

arXiv.org e-Print Archive

The Francis Crick Institute

H2r: Identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments

Author: A del Sol Mesa
AL Barabási
B Rost
C Notredame
C Ouzounis
C Sander
C Steegborn
CC Hyde
CE Shannon
D Altschuh
DR Caffrey
E Eyal
E Neher
E Weber-Ban
E Zuckerkandl
ER Tillier
F Pearl
GB Gloor
GM Süel
HO Villar
I Kass
IM Wallace
J Tsai
JA Capra
JP Dekker
K Katoh
K Wang
LA Kelley
LC Martin
M Landau
Matthias Zwick
MC Saraf
ME Noble
O Noivirt
O Olmea
OV Kalinina
OV Kalinina
R Merkl
RA Estabrook
RA Laskowski
Rainer Merkl
RD Finn
RI Dima
S Henikoff
SJ Fleishman
SM Larson
SW Lockless
T Lassmann
T Sato
TD Schneider
U Göbel
V Kulik
V Kulik
WH Press
WR Atchley
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: A multiple sequence alignment (MSA) generated for a protein can be used to characterise residues by means of a statistical analysis of single columns. In addition to the examination of individual positions, the investigation of co-variation of amino acid frequencies offers insights into function and evolution of the protein and residues. RESULTS: We introduce conn(k), a novel parameter for the characterisation of individual residues. For each residue k, conn(k) is the number of most extreme signals of co-evolution. These signals were deduced from a normalised mutual information (MI) value U(k, l) computed for all pairs of residues k, l. We demonstrate that conn(k) is a more robust indicator than an individual MI-value for the prediction of residues most plausibly important for the evolution of a protein. This proposition was inferred by means of statistical methods. It was further confirmed by the analysis of several proteins. A server that computes conn(k)-values is available at http://www-bioinf.uni-regensburg.de. CONCLUSION: The algorithm H2r, which analyses MSAs and computes conn(k)-values, characterises a specific class of residues. In contrast to strictly conserved ones, these residues possess some flexibility in the composition of side chains. However, their allocation is sensibly balanced with several other positions, as indicated by conn(k).
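
A rough sketch of how a conn(k)-style count could be computed is given here. The abstract does not state the formula for U(k, l) or the cut-off for "most extreme signals", so two assumptions are made: U is taken as the symmetric uncertainty 2 (H(k) + H(l) - H(k, l)) / (H(k) + H(l)), and the extreme signals are simply the `top_n` highest-scoring pairs. The names `normalised_mi`, `conn` and `top_n`, and the toy alignment, are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * log2(c / total) for c in counts.values())

def normalised_mi(col_k, col_l):
    """Assumed form of U(k, l): symmetric uncertainty of two columns."""
    h_k, h_l = entropy(col_k), entropy(col_l)
    h_kl = entropy(list(zip(col_k, col_l)))
    if h_k + h_l == 0:
        return 0.0  # both columns are fully conserved
    return 2.0 * (h_k + h_l - h_kl) / (h_k + h_l)

def conn(columns, top_n):
    """Count how often each column occurs among the top_n scoring pairs."""
    scores = {(k, l): normalised_mi(columns[k], columns[l])
              for k, l in combinations(range(len(columns)), 2)}
    top_pairs = sorted(scores, key=scores.get, reverse=True)[:top_n]
    counts = Counter(pos for pair in top_pairs for pos in pair)
    return [counts[k] for k in range(len(columns))]

# Toy alignment of 4 sequences, stored column-wise.
# Columns 0-2 co-vary perfectly, column 3 varies independently,
# column 4 is strictly conserved.
columns = ["AAVV", "LLII", "KKRR", "GSGS", "GGGG"]
conn_values = conn(columns, top_n=3)
```

In this toy alignment the three co-varying columns each take part in two of the three top pairs, while the independent and the conserved column take part in none.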

University of Regensburg Publication Server

Springer - Publisher Connector

A Possible Role for Metallic Ions in the Carbohydrate Cluster Recognition Displayed by a Lewis Y Specific Antibody

Author: A Basu
A Vagin
AC Esqueda
AM Edstrom
AM Scott
AM Scott
AM Scott
AM Scott
Andrew M. Scott
AT Brunger
B Hemmens
B Morrison
BJ Kim
Bostjan Kobe
BW Yin
D Altschuh
DA Calarese
DE Koppel
DR Bundle
DS Heller
E van Liempt
E Yuriev
H Farhan
H Haase
J Sakamoto
JJ Candelier
JJ Garcia-Vallejo
JM Rini
JN Herron
K Kitamura
K Murata
LH Pai
LH Pai-Scherf
LM Krug
M Folin
M Klinger
M Reynolds
M Sundstrom
MF Dunn
MN Saleh
MP Kelly
OH Szolar
PA Ramsland
PA Ramsland
Paul A. Ramsland
PC Pang
R Mollicone
RA Laskowski
RL Stanfield
RM Perera
S Bailey
S Chacko
S Chalabi
TG Johns
TM Moehler
U Schulze-Gahmen
V Minas
V Vallas
William Farrugia
Y Cao
YH Ding
YS Kim
Z Otwinowski
Publication venue: Public Library of Science
Publication date: 01/01/2009

BACKGROUND: Lewis Y (Le(y)) is a blood group-related carbohydrate that is expressed at high surface densities on the majority of epithelial carcinomas and is a promising target for antibody-based immunotherapy. A humanized Le(y)-specific antibody (hu3S193) has shown encouraging safety, pharmacokinetic and tumor-targeting properties in recently completed Phase I clinical trials. METHODOLOGY/PRINCIPAL FINDINGS: We report the three-dimensional structures for both the free (unliganded) and bound (Le(y) tetrasaccharide) hu3S193 Fab from the same crystal grown in the presence of divalent zinc ions. There is no evidence of significant conformational changes occurring in either the Le(y) carbohydrate antigen or the hu3S193 binding site, which suggests a rigid fit binding mechanism. In the crystal, the hu3S193 Fab molecules are coordinated at their protein-protein interface by two zinc ions and, in solution, aggregation of Fab can be initiated by zinc, but not magnesium, ions. Dynamic light scattering revealed that zinc ions could initiate a sharp transition from hu3S193 Fab monomers to large multimeric aggregates in solution. CONCLUSIONS/SIGNIFICANCE: Zinc ions can mediate interactions between hu3S193 Fab in crystals and in solution. Whether metallic ion mediated aggregation of antibody occurs in vivo is not known, but the present results suggest that similar clustering mechanisms could occur when hu3S193 binds to Le(y) on cells, particularly given the high surface densities of antigen on the target tumor cells.

University of Melbourne Institutional Repository

espace@Curtin

Conserved and variable correlated mutations in the plant MADS protein network

Author: A Bairoch
A Becker
A Fuchs
A Lupas
A Sali
AA Fodor
Aalt DJ van Dijk
AD Han
ADJ van Dijk
AH Paterson
AK Ramani
AS Veron
AT Brunger
BA Krizek
C Espinosa-soto
CM Buslje
CS Goh
CS Miller
D Altschuh
D Juan
DA Afonnikov
DS Horner
E Santelli
EA Merritt
F Fornara
F Pazos
F Pazos
F Pazos
G Angenent
GA Tuskan
H Ashkenazy
HB Fraser
HY Shan
HY Shan
HY Yu
I Halperin
J Lim
J Sundstrom
JD Thompson
JG Caporaso
JL Riechmann
JMG Izarzugaza
K Hill
K Huang
K Kaufmann
K Kaufmann
L Hakes
L Mendoza
L Parenicova
L Pellegrini
LC Martin
LJ Cseke
LP Martinez-Castilla
M Hassler
M Ng
M Socolich
MA Fares
MJ Buck
N Shitsukawa
NA Kane
NJ Mulder
O Noivirt
PJ Kraulis
PJ Waddell
R Melzer
R Ming
R Velasco
RC Edgar
RGH Immink
RKP Kuipers
RM Clark
Roeland CHJ van Ham
S Ciannamea
S De Bodt
S de Folter
S Henikoff
S Mika
SA Goff
SA Rensing
SAA Travers
SAA Travers
SR Eddy
T Hernandez-Hernandez
T Sato
Y Mo
YZ Yang
YZ Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Plant MADS domain proteins are involved in a variety of developmental processes for which their ability to form various interactions is a key requisite. However, not much is known about the structure of these proteins or their complexes, whereas such knowledge would be valuable for a better understanding of their function. Here, we analyze those proteins and the complexes they form using a correlated mutation approach in combination with available structural, bioinformatics and experimental data. RESULTS: Correlated mutations are affected by several types of noise, which is difficult to disentangle from the real signal. In our analysis of the MADS domain proteins, we apply for the first time a correlated mutation analysis to a family of interacting proteins. This provides a unique way to investigate the amount of signal that is present in correlated mutations because it allows direct comparison of mutations in various family members and assessing their conservation. We show that correlated mutations in general are conserved within the various family members, and if not, the variability at the respective positions is less in the proteins in which the correlated mutation does not occur. Also, intermolecular correlated mutation signals for interacting pairs of proteins display clear overlap with other bioinformatics data, which is not the case for non-interacting protein pairs, an observation which validates the intermolecular correlated mutations. Having validated the correlated mutation results, we apply them to infer the structural organization of the MADS domain proteins. CONCLUSION: Our analysis enables understanding of the structural organization of the MADS domain proteins, including support for predicted helices based on correlated mutation patterns, and evidence for a specific interaction site in those proteins.

Springer - Publisher Connector

Wageningen University & Research Publications

Computing Highly Correlated Positions Using Mutual Information and Graph Theory for G Protein-Coupled Receptors

Author: A Pagano
AA Ivanov
AK Ramani
AR Ortiz
B Galitsky
BTM Korber
C Goh
C Hemmerich
C Yeang
Carson C. Chow
CD Strader
CE Shannon
CJ Harris
CS Sum
D Altschuh
DD Pollock
DD Pollock
DKY Chiu
DM Rosenbaum
E Neher
F Horn
F Knoflach
F Pazos
F Pazos
FY Carroll
G Casari
G Kleinau
G Kleinau
G Suel
G Swaminath
GB Gloor
H Herzel
H Jaschke
I Halperin
I Kass
IG Tikhonova
IN Shindyalov
J Dutheil
J Kim
J Thomas
JA Ballesteros
JA Ballesteros
JA Capra
JE Donald
JE Donald
JE Donald
JS Surgand
JW Kelly
JX Hu
K Palczewski
K Ray
K Ray
K Sjolander
K Ye
K Ye
KD Pruitt
KL Pierce
KY Yip
L Lewyn
L Oliveira
L Oliveira
L Oliveira
L Pritchard
LA Mirny
LC Martin
LH Heitman
M Raviscioni
M Scarselli
M Socolich
MA Hanson
Matthieu Louis
ME Olah
MJ Buck
ML Lopez-Rodriguez
MS Roulston
MW Dimmic
ND Clarke
NG Hoffman
O Lichtarge
O Lichtarge
O Noivirt
OF Lange
OV Kalinina
PJ Kundrotas
PR Gouldson
R Banerjee
R Brun
R Fredriksson
R Jothi
R Steuer
RI Dima
RM Williamson
RR Gutell
S Chakrabarti
S Costanzi
S Costanzi
S Costanzi
S Costanzi
S Govindarajan
S Litschig
S Madabushi
S Moore
S Moro
S Ohno
S Takeda
Sarosh N. Fatakia
SB Nagl
SB Nagl
SD Dunn
SGF Rasmussen
SJ Fleishman
SS Hannenhalli
Stefano Costanzi
SW Lockless
T Klabunde
T Klabunde
T Sato
T Warne
TD Schneider
TM Cover
V Batageli
V Cherezov
VP Jaakola
WP Russ
WR Atchley
WR Atchley
WR Taylor
Y Liu
Y Qi
Publication venue: Public Library of Science
Publication date: 05/03/2009

G protein-coupled receptors (GPCRs) are a superfamily of seven transmembrane-spanning proteins involved in a wide array of physiological functions and are the most common targets of pharmaceuticals. This study aims to identify a cohort or clique of positions that share high mutual information. Using a multiple sequence alignment of the transmembrane (TM) domains, we calculated the mutual information between all inter-TM pairs of aligned positions and ranked the pairs by mutual information. A mutual information graph was constructed with vertices that corresponded to TM positions, and edges between vertices were drawn if the mutual information exceeded a threshold of statistical significance. Positions with high degree (i.e. those that had significant mutual information with a large number of other positions) were found to line a well defined inter-TM ligand binding cavity for class A as well as class C GPCRs. Although the natural ligands of class C receptors bind to their extracellular N-terminal domains, the possibility of modulating their activity through ligands that bind to their helical bundle has been reported. Such positions were not found for class B GPCRs, in agreement with the observation that there are no known ligands that bind within their TM helical bundle. All identified key positions formed a clique within the MI graph of interest. For a subset of class A receptors we also considered the alignment of a portion of the second extracellular loop, and found that the two positions adjacent to the conserved Cys that bridges the loop with the TM3 qualified as key positions. Our algorithm may be useful for localizing topologically conserved regions in other protein families.
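
The graph construction described in this abstract can be sketched as follows. The sketch starts from an already computed mutual information matrix and uses a fixed numeric threshold in place of the statistical significance test, which the abstract does not specify. The names `mi_graph` and `is_clique`, the degree cut-off of 2, and the 4-position matrix are hypothetical.

```python
from itertools import combinations

def mi_graph(mi, threshold):
    """Adjacency sets with an edge (i, j) when mi[i][j] exceeds threshold."""
    n = len(mi)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if mi[i][j] > threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def is_clique(adj, nodes):
    """True when every pair of the given nodes is joined by an edge."""
    return all(b in adj[a] for a, b in combinations(nodes, 2))

# Hypothetical symmetric MI values for four aligned positions.
mi = [[0.0, 0.9, 0.8, 0.1],
      [0.9, 0.0, 0.7, 0.2],
      [0.8, 0.7, 0.0, 0.6],
      [0.1, 0.2, 0.6, 0.0]]

adj = mi_graph(mi, threshold=0.5)

# Degree = number of positions sharing above-threshold MI with this one.
degree = {node: len(neighbours) for node, neighbours in adj.items()}

# Treat positions with degree of at least 2 as candidate key positions.
key_positions = [node for node, d in degree.items() if d >= 2]
forms_clique = is_clique(adj, key_positions)
```

With these toy values positions 0, 1 and 2 are selected and are mutually connected, which mirrors the abstract's observation that the key positions form a clique.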