Identification of direct residue contacts in protein-protein interaction by message passing
Understanding the molecular determinants of specificity in protein-protein
interaction is an outstanding challenge of postgenome biology. The availability
of large protein databases generated from sequences of hundreds of bacterial
genomes enables various statistical approaches to this problem. In this context
covariance-based methods have been used to identify correlation between amino
acid positions in interacting proteins. However, these methods have an
important shortcoming, in that they cannot distinguish between directly and
indirectly correlated residues. We developed a method that combines covariance
analysis with global inference analysis, adopted from use in statistical
physics. Applied to a set of >2,500 representatives of the bacterial
two-component signal transduction system, the combination of covariance with
global inference successfully and robustly identified residue pairs that are
proximal in space without resorting to ad hoc tuning parameters, both for
heterointeractions between sensor kinase (SK) and response regulator (RR)
proteins and for homointeractions between RR proteins. The spectacular success
of this approach illustrates the effectiveness of the global inference approach
in identifying direct interaction based on sequence information alone. We
expect this method to be applicable soon to interaction surfaces between
proteins present in only 1 copy per genome as the number of sequenced genomes
continues to expand. Use of this method could significantly increase the
potential targets for therapeutic intervention, shed light on the mechanism of
protein-protein interaction, and establish the foundation for the accurate
prediction of interacting protein partners. Comment: Supplementary information available on
http://www.pnas.org/content/106/1/67.abstrac
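The covariance step described in this abstract can be illustrated with a minimal sketch, assuming mutual information between alignment columns as the covariance measure. The toy alignment and the function names are hypothetical and are not taken from the paper. The sketch covers only the pairwise-correlation step, not the global inference that separates direct from indirect correlations.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    count_i = Counter(column(alignment, i))
    count_j = Counter(column(alignment, j))
    count_ij = Counter(zip(column(alignment, i), column(alignment, j)))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 covary perfectly,
# column 2 is fully conserved, column 3 varies independently of column 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

covarying = mutual_information(toy_alignment, 0, 1)
independent = mutual_information(toy_alignment, 0, 3)
conserved = mutual_information(toy_alignment, 0, 2)
```

A high score alone does not show that two positions are in contact: a chain of directly coupled pairs can also produce it, which is the shortcoming the abstract attributes to covariance-only methods.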
Evolutionary analysis of mitochondrially encoded proteins of toad-headed lizards, Phrynocephalus, along an altitudinal gradient.
BACKGROUND: Animals living at high altitude must adapt to environments with hypoxia and low temperatures, but relatively little is known about underlying genetic changes. Toad-headed lizards of the genus Phrynocephalus cover a broad altitudinal gradient of over 4000 m and are useful models for studies of such adaptive responses. In one of the first studies to have considered selection on mitochondrial protein-coding regions in an ectothermic group distributed over such a wide range of environments, we analysed nineteen complete mitochondrial genomes from all Chinese Phrynocephalus (including eight genomes sequenced for the first time). Initial analyses used site and branch-site model (program: PAML) approaches to examine nonsynonymous: synonymous substitution rates across the mtDNA tree. RESULTS: Ten positively selected sites were discovered, nine of which corresponded to subunits ND2, ND3, ND4, ND5, and ND6 within the respiratory chain enzyme mitochondrial Complex I (NADH Coenzyme Q oxidoreductase). Four of these sites showed evidence of general long-term selection across the group while the remainder showed evidence of episodic selection across different branches of the tree. Some of these branches corresponded to increases in altitude and/or latitude. Analyses of physicochemical changes in protein structures revealed that residue changes at sites that were under selection corresponded to major functional differences. Analyses of coevolution point to coevolution of selected sites within the ND4 subunit, with key sites associated with proton translocation across the mitochondrial membrane. CONCLUSIONS: Our results identify mitochondrial Complex I as a target for environment-mediated selection in this group of lizards, a complex that frequently appears to be under selection in other organisms. This makes these lizards good candidates for more detailed future studies of molecular evolution
Direct knock-on of desolvated ions governs strict ion selectivity in K+ channels
The seeming contradiction that K+ channels conduct K+ ions at maximal throughput rates while not permeating slightly smaller Na+ ions has perplexed scientists for decades. Although numerous models have addressed selective permeation in K+ channels, the combination of conduction efficiency and ion selectivity has not yet been linked through a unified functional model. Here, we investigate the mechanism of ion selectivity through atomistic simulations totalling more than 400 μs in length, which include over 7,000 permeation events. Together with free-energy calculations, our simulations show that both rapid permeation of K+ and ion selectivity are ultimately based on a single principle: the direct knock-on of completely desolvated ions in the channels' selectivity filter. Herein, the strong interactions between multiple 'naked' ions in the four filter binding sites give rise to a natural exclusion of any competing ions. Our results are in excellent agreement with experimental selectivity data, measured ion interaction energies and recent two-dimensional infrared spectra of filter ion configurations
Emergence of terpene cyclization in Artemisia annua
The emergence of terpene cyclization was critical to the evolutionary expansion of chemical diversity yet remains unexplored. Here we report the first discovery of an epistatic network of residues that controls the onset of terpene cyclization in Artemisia annua. We begin with amorpha-4,11-diene synthase (ADS) and (E)-β-farnesene synthase (BFS), a pair of terpene synthases that produce cyclic or linear terpenes, respectively. A library of ~27,000 enzymes is generated by breeding combinations of natural amino-acid substitutions from the cyclic into the linear producer. We discover that one dominant mutation is sufficient to activate cyclization and that, together with two additional residues, it comprises a network of strongly epistatic interactions that activate, suppress or reactivate cyclization. Remarkably, this epistatic network of equivalent residues also controls cyclization in a BFS homologue from Citrus junos. Fitness landscape analysis of mutational trajectories provides quantitative insights into a major epoch in specialized metabolism
TrkA Undergoes a Tetramer-to-Dimer Conversion to Open TrkH Which Enables Changes in Membrane Potential
TrkH is a bacterial ion channel implicated in K+ uptake and pH regulation. TrkH assembles with its regulatory protein, TrkA, which closes the channel when bound to ADP and opens it when bound to ATP. However, it is unknown how nucleotides control the gating of TrkH through TrkA. Here we report the structures of the TrkH-TrkA complex in the presence of ADP or ATP. TrkA forms a tetrameric ring when bound to ADP and constrains TrkH to a closed conformation. The TrkA ring splits into two TrkA dimers in the presence of ATP and releases the constraints on TrkH, resulting in an open channel conformation. Functional studies show that both the tetramer-to-dimer conversion of TrkA and the loss of constraints on TrkH are required for channel gating. In addition, deletion of TrkA in Escherichia coli depolarizes the cell, suggesting that the TrkH-TrkA complex couples changes in intracellular nucleotides to membrane potential
Predicting residue contacts using pragmatic correlated mutations method: reducing the false positives
BACKGROUND: Predicting residue contacts from the primary amino acid sequence alone is an important task that can guide 3D structure modeling and help verify the quality of predicted 3D structures. The correlated mutations (CM) method is the most promising approach, and it has been used to predict amino acid pairs that are distant in the primary sequence but form contacts in the native 3D structure of homologous proteins. RESULTS: Here we report a new implementation of the CM method with an added set of selection rules (filters). The parameters of the algorithm were optimized against fifteen high-resolution crystal structures, with an optimization criterion that maximized the confidence of the predictions. The optimization resulted in a true positive ratio (TPR) of 0.08 for the CM without filters and a TPR of 0.14 for the CM with filters. The protocol was further benchmarked against 65 high-resolution structures that were not included in the optimization test. The benchmarking resulted in a TPR of 0.07 for the CM without filters and a TPR of 0.09 for the CM with filters. CONCLUSION: Thus, the inclusion of selection rules resulted in an overall improvement of 30%. In addition, the pair-wise comparison of TPR for each protein without and with filters resulted in an average improvement of 1.7. The methodology was implemented in a web server that is freely available to the public. The purpose of this implementation is to provide 3D structure predictors with a tool that can help rank alternative models by the number of predicted contacts they satisfy, and that can provide a confidence score for contacts in cases where the structure is known
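The TPR figures quoted in this abstract can be checked with a short calculation. This is a sketch under one assumption: that the reported 30% refers to the relative change in benchmark TPR. The abstract does not state how the figure was derived, and the function name is hypothetical.

```python
def relative_improvement(baseline, improved):
    """Fractional gain of `improved` over `baseline`."""
    return (improved - baseline) / baseline


# TPR values as rounded in the abstract (without filters, with filters).
optimization_gain = relative_improvement(0.08, 0.14)  # 15 optimization structures
benchmark_gain = relative_improvement(0.07, 0.09)     # 65 benchmark structures

print(f"optimization set: {optimization_gain:.0%}")  # prints "optimization set: 75%"
print(f"benchmark set: {benchmark_gain:.0%}")        # prints "benchmark set: 29%"
```

With the rounded values, the benchmark gain comes out at roughly 29%, which is consistent with the reported 30% if the authors worked from unrounded TPRs.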
Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work
One of the most critical problems we face in the study of biological systems
is building accurate statistical descriptions of them. This problem has been
particularly challenging because biological systems typically contain large
numbers of interacting elements, which precludes the use of standard brute
force approaches. Recently, though, several groups have reported that there may
be an alternate strategy. The reports show that reliable statistical models can
be built without knowledge of all the interactions in a system; instead,
pairwise interactions can suffice. These findings, however, are based on the
analysis of small subsystems. Here we ask whether the observations will
generalize to systems of realistic size, that is, whether pairwise models will
provide reliable descriptions of true biological systems. Our results show
that, in most cases, they will not. The reason is that there is a crossover in
the predictive power of pairwise models: If the size of the subsystem is below
the crossover point, then the results have no predictive power for large
systems. If the size is above the crossover point, the results do have
predictive power. This work thus provides a general framework for determining
the extent to which pairwise models can be used to predict the behavior of
whole biological systems. Applied to neural data, the size of most systems
studied so far is below the crossover point
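For reference, a pairwise maximum entropy model over binary variables is conventionally written in the form below. The symbols $h_i$, $J_{ij}$, and $Z$ are standard notation assumed here for illustration; they do not appear in the abstract.

```latex
P(x_1, \dots, x_N) = \frac{1}{Z} \exp\!\Big( \sum_{i} h_i x_i + \sum_{i<j} J_{ij} x_i x_j \Big),
\qquad
Z = \sum_{x} \exp\!\Big( \sum_{i} h_i x_i + \sum_{i<j} J_{ij} x_i x_j \Big)
```

The parameters $h_i$ and $J_{ij}$ are chosen so that the model reproduces the measured means $\langle x_i \rangle$ and pairwise correlations $\langle x_i x_j \rangle$. The question the abstract raises is whether a fit of this kind on a small subsystem says anything reliable about the full system.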
Disentangling Direct from Indirect Co-Evolution of Residues in Protein Alignments
Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A key impediment to this approach is that strong statistical dependencies are also observed for many residue pairs that are distal in the structure. Using a comprehensive analysis of protein domains with available three-dimensional structures we show that co-evolving contacts very commonly form chains that percolate through the protein structure, inducing indirect statistical dependencies between many distal pairs of residues. We characterize the distributions of length and spatial distance traveled by these co-evolving contact chains and show that they explain a large fraction of observed statistical dependencies between structurally distal pairs. We adapt a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and we demonstrate that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected. To illustrate how additional information can be incorporated into our method, we incorporate a phylogenetic correction, and we develop an informative prior that takes into account that the probability for a pair of residues to contact depends strongly on their primary-sequence distance and the amount of conservation that the corresponding columns in the multiple alignment exhibit. We show that our model including these extensions dramatically improves the accuracy of contact prediction from multiple sequence alignments
A novel synthesis and detection method for cap-associated adenosine modifications in mouse mRNA
A method is described for the detection of certain nucleotide modifications adjacent to the 5' 7-methyl guanosine cap of mRNAs from individual genes. The method quantitatively measures the relative abundance of 2'-O-methyl and N6,2'-O-dimethyladenosine, two of the most common modifications. In order to identify and quantify the amounts of N6,2'-O-dimethyladenosine, a novel method for the synthesis of modified adenosine phosphoramidites was developed. This method is a one-step synthesis, and the product can be used directly for the production of N6,2'-O-dimethyladenosine-containing RNA oligonucleotides. The nature of the cap-adjacent nucleotides was shown to be characteristic for mRNAs from individual genes transcribed in liver and testis
Direct-coupling analysis of residue co-evolution captures native contacts across many protein families
The similarity in the three-dimensional structures of homologous proteins
imposes strong constraints on their sequence variability. It has long been
suggested that the resulting correlations among amino acid compositions at
different sequence positions can be exploited to infer spatial contacts within
the tertiary protein structure. Crucial to this inference is the ability to
disentangle direct and indirect correlations, as accomplished by the recently
introduced Direct Coupling Analysis (DCA) (Weigt et al. (2009) Proc Natl Acad
Sci 106:67). Here we develop a computationally efficient implementation of DCA,
which allows us to evaluate the accuracy of contact prediction by DCA for a
large number of protein domains, based purely on sequence information. DCA is
shown to yield a large number of correctly predicted contacts, recapitulating
the global structure of the contact map for the majority of the protein domains
examined. Furthermore, our analysis captures clear signals beyond intra-domain
residue contacts, arising, e.g., from alternative protein conformations,
ligand-mediated residue couplings, and inter-domain interactions in protein
oligomers. Our findings suggest that contacts predicted by DCA can be used as a
reliable guide to facilitate computational predictions of alternative protein
conformations, protein complex formation, and even the de novo prediction of
protein domain structures, provided the existence of a large number of
homologous sequences which are being rapidly made available due to advances in
genome sequencing. Comment: 28 pages, 7 figures, to appear in PNA
