
    Identification of direct residue contacts in protein-protein interaction by message passing

    Understanding the molecular determinants of specificity in protein-protein interaction is an outstanding challenge of postgenome biology. The availability of large protein databases generated from sequences of hundreds of bacterial genomes enables various statistical approaches to this problem. In this context covariance-based methods have been used to identify correlation between amino acid positions in interacting proteins. However, these methods have an important shortcoming, in that they cannot distinguish between directly and indirectly correlated residues. We developed a method that combines covariance analysis with global inference analysis, adopted from use in statistical physics. Applied to a set of >2,500 representatives of the bacterial two-component signal transduction system, the combination of covariance with global inference successfully and robustly identified residue pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor kinase (SK) and response regulator (RR) proteins and for homointeractions between RR proteins. The spectacular success of this approach illustrates the effectiveness of the global inference approach in identifying direct interaction based on sequence information alone. We expect this method to be applicable soon to interaction surfaces between proteins present in only 1 copy per genome as the number of sequenced genomes continues to expand. Use of this method could significantly increase the potential targets for therapeutic intervention, shed light on the mechanism of protein-protein interaction, and establish the foundation for the accurate prediction of interacting protein partners. Comment: Supplementary information available on http://www.pnas.org/content/106/1/67.abstrac

    Evolutionary analysis of mitochondrially encoded proteins of toad-headed lizards, Phrynocephalus, along an altitudinal gradient.

    BACKGROUND: Animals living at high altitude must adapt to environments with hypoxia and low temperatures, but relatively little is known about underlying genetic changes. Toad-headed lizards of the genus Phrynocephalus cover a broad altitudinal gradient of over 4000 m and are useful models for studies of such adaptive responses. In one of the first studies to have considered selection on mitochondrial protein-coding regions in an ectothermic group distributed over such a wide range of environments, we analysed nineteen complete mitochondrial genomes from all Chinese Phrynocephalus (including eight genomes sequenced for the first time). Initial analyses used site and branch-site model (program: PAML) approaches to examine nonsynonymous: synonymous substitution rates across the mtDNA tree. RESULTS: Ten positively selected sites were discovered, nine of which corresponded to subunits ND2, ND3, ND4, ND5, and ND6 within the respiratory chain enzyme mitochondrial Complex I (NADH Coenzyme Q oxidoreductase). Four of these sites showed evidence of general long-term selection across the group while the remainder showed evidence of episodic selection across different branches of the tree. Some of these branches corresponded to increases in altitude and/or latitude. Analyses of physicochemical changes in protein structures revealed that residue changes at sites that were under selection corresponded to major functional differences. Analyses of coevolution point to coevolution of selected sites within the ND4 subunit, with key sites associated with proton translocation across the mitochondrial membrane. CONCLUSIONS: Our results identify mitochondrial Complex I as a target for environment-mediated selection in this group of lizards, a complex that frequently appears to be under selection in other organisms. This makes these lizards good candidates for more detailed future studies of molecular evolution

    Direct knock-on of desolvated ions governs strict ion selectivity in K+ channels

    The seeming contradiction that K+ channels conduct K+ ions at maximal throughput rates while not permeating slightly smaller Na+ ions has perplexed scientists for decades. Although numerous models have addressed selective permeation in K+ channels, the combination of conduction efficiency and ion selectivity has not yet been linked through a unified functional model. Here, we investigate the mechanism of ion selectivity through atomistic simulations totalling more than 400 μs in length, which include over 7,000 permeation events. Together with free-energy calculations, our simulations show that both rapid permeation of K+ and ion selectivity are ultimately based on a single principle: the direct knock-on of completely desolvated ions in the channels' selectivity filter. Herein, the strong interactions between multiple 'naked' ions in the four filter binding sites give rise to a natural exclusion of any competing ions. Our results are in excellent agreement with experimental selectivity data, measured ion interaction energies and recent two-dimensional infrared spectra of filter ion configurations

    Emergence of terpene cyclization in Artemisia annua

    The emergence of terpene cyclization was critical to the evolutionary expansion of chemical diversity yet remains unexplored. Here we report the first discovery of an epistatic network of residues that controls the onset of terpene cyclization in Artemisia annua. We begin with amorpha-4,11-diene synthase (ADS) and (E)-β-farnesene synthase (BFS), a pair of terpene synthases that produce cyclic or linear terpenes, respectively. A library of ~27,000 enzymes is generated by breeding combinations of natural amino-acid substitutions from the cyclic into the linear producer. We discover that one dominant mutation is sufficient to activate cyclization and that, together with two additional residues, it forms a network of strongly epistatic interactions that activate, suppress or reactivate cyclization. Remarkably, this epistatic network of equivalent residues also controls cyclization in a BFS homologue from Citrus junos. Fitness landscape analysis of mutational trajectories provides quantitative insights into a major epoch in specialized metabolism.

    TrkA Undergoes a Tetramer-to-Dimer Conversion to Open TrkH Which Enables Changes in Membrane Potential

    TrkH is a bacterial ion channel implicated in K+ uptake and pH regulation. TrkH assembles with its regulatory protein, TrkA, which closes the channel when bound to ADP and opens it when bound to ATP. However, it is unknown how nucleotides control the gating of TrkH through TrkA. Here we report the structures of the TrkH-TrkA complex in the presence of ADP or ATP. TrkA forms a tetrameric ring when bound to ADP and constrains TrkH to a closed conformation. The TrkA ring splits into two TrkA dimers in the presence of ATP and releases the constraints on TrkH, resulting in an open channel conformation. Functional studies show that both the tetramer-to-dimer conversion of TrkA and the loss of constraints on TrkH are required for channel gating. In addition, deletion of TrkA in Escherichia coli depolarizes the cell, suggesting that the TrkH-TrkA complex couples changes in intracellular nucleotides to membrane potential

    Predicting residue contacts using pragmatic correlated mutations method: reducing the false positives

    BACKGROUND: Predicting residue contacts using primary amino acid sequence alone is an important task that can guide 3D structure modeling and can verify the quality of the predicted 3D structures. The correlated mutations (CM) method serves as the most promising approach and it has been used to predict amino acid pairs that are distant in the primary sequence but form contacts in the native 3D structure of homologous proteins. RESULTS: Here we report a new implementation of the CM method with an added set of selection rules (filters). The parameters of the algorithm were optimized against fifteen high resolution crystal structures with an optimization criterion that maximized the confidence of the predictions. The optimization resulted in a true positive ratio (TPR) of 0.08 for the CM without filters and a TPR of 0.14 for the CM with filters. The protocol was further benchmarked against 65 high resolution structures that were not included in the optimization test. The benchmarking resulted in a TPR of 0.07 for the CM without filters and a TPR of 0.09 for the CM with filters. CONCLUSION: Thus, the inclusion of selection rules resulted in an overall improvement of 30%. In addition, the pair-wise comparison of TPR for each protein without and with filters resulted in an average improvement of 1.7. The methodology was implemented into a web server that is freely available to the public. The purpose of this implementation is to provide 3D structure predictors with a tool that can help rank alternative models by the number of predicted contacts they satisfy, and that can provide a confidence score for contacts in cases where the structure is known.
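    As a quick arithmetic check on the figures quoted in the abstract above, the following minimal sketch computes the relative change in TPR with and without filters. It assumes that the "overall improvement of 30%" refers to the relative gain on the 65-structure benchmark (0.07 to 0.09); the abstract does not state this explicitly, and the helper name `relative_improvement` is purely illustrative.

```python
def relative_improvement(before, after):
    """Relative change (after - before) / before, as a fraction."""
    return (after - before) / before


# TPR values quoted in the abstract: CM without filters vs. CM with filters.
benchmark_gain = relative_improvement(0.07, 0.09)     # 65-structure benchmark
optimization_gain = relative_improvement(0.08, 0.14)  # 15-structure optimization set

print(f"benchmark set:    {benchmark_gain:.1%}")
print(f"optimization set: {optimization_gain:.1%}")
```

    Under that assumption the benchmark gain is roughly 29%, consistent with the rounded 30% in the abstract, while the gain on the optimization set is noticeably larger, as one would expect for parameters tuned on those structures.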

    Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work

    One of the most critical problems we face in the study of biological systems is building accurate statistical descriptions of them. This problem has been particularly challenging because biological systems typically contain large numbers of interacting elements, which precludes the use of standard brute force approaches. Recently, though, several groups have reported that there may be an alternate strategy. The reports show that reliable statistical models can be built without knowledge of all the interactions in a system; instead, pairwise interactions can suffice. These findings, however, are based on the analysis of small subsystems. Here we ask whether the observations will generalize to systems of realistic size, that is, whether pairwise models will provide reliable descriptions of true biological systems. Our results show that, in most cases, they will not. The reason is that there is a crossover in the predictive power of pairwise models: If the size of the subsystem is below the crossover point, then the results have no predictive power for large systems. If the size is above the crossover point, the results do have predictive power. This work thus provides a general framework for determining the extent to which pairwise models can be used to predict the behavior of whole biological systems. Applied to neural data, the size of most systems studied so far is below the crossover point

    Disentangling Direct from Indirect Co-Evolution of Residues in Protein Alignments

    Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A key impediment to this approach is that strong statistical dependencies are also observed for many residue pairs that are distal in the structure. Using a comprehensive analysis of protein domains with available three-dimensional structures we show that co-evolving contacts very commonly form chains that percolate through the protein structure, inducing indirect statistical dependencies between many distal pairs of residues. We characterize the distributions of length and spatial distance traveled by these co-evolving contact chains and show that they explain a large fraction of observed statistical dependencies between structurally distal pairs. We adapt a recently developed Bayesian network model into a rigorous procedure for disentangling direct from indirect statistical dependencies, and we demonstrate that this method not only successfully accomplishes this task, but also allows contacts with weak statistical dependency to be detected. To illustrate how additional information can be incorporated into our method, we incorporate a phylogenetic correction, and we develop an informative prior that takes into account that the probability for a pair of residues to contact depends strongly on their primary-sequence distance and the amount of conservation that the corresponding columns in the multiple alignment exhibit. We show that our model including these extensions dramatically improves the accuracy of contact prediction from multiple sequence alignments
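    To make "statistical dependency between columns" concrete, here is a minimal sketch of plain mutual information between two columns of a multiple sequence alignment. This is only the naive pairwise measure, not the Bayesian network procedure described in the abstract, and it does not separate direct from indirect dependencies. The function name and the toy alignment are illustrative assumptions.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    `msa` is a list of equal-length sequences. Frequencies are raw counts
    with no pseudocounts, sequence weighting, or phylogenetic correction.
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    f_i = Counter(col_i)               # single-column counts
    f_j = Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))  # joint counts for the column pair
    mi = 0.0
    for (a, b), count in f_ij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((f_i[a] / n) * (f_j[b] / n)))
    return mi


# Toy alignment: columns 0 and 1 co-vary perfectly; column 2 varies independently.
toy_msa = ["AKL", "AKV", "GEL", "GEV"]
mi_01 = column_mutual_information(toy_msa, 0, 1)
mi_02 = column_mutual_information(toy_msa, 0, 2)
print(mi_01, mi_02)
```

    In the toy alignment the perfectly co-varying pair scores log 2 and the independent pair scores zero. In real alignments, chains of contacting residues can give high scores to distal pairs as well, which is the problem the abstract addresses.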

    A novel synthesis and detection method for cap-associated adenosine modifications in mouse mRNA

    A method is described for the detection of certain nucleotide modifications adjacent to the 5' 7-methyl guanosine cap of mRNAs from individual genes. The method quantitatively measures the relative abundance of 2'-O-methyl and N6,2'-O-dimethyladenosine, two of the most common modifications. In order to identify and quantify the amounts of N6,2'-O-dimethyladenosine, a novel method for the synthesis of modified adenosine phosphoramidites was developed. This method is a one-step synthesis and the product can be used directly for the production of N6,2'-O-dimethyladenosine-containing RNA oligonucleotides. The nature of the cap-adjacent nucleotides was shown to be characteristic for mRNAs from individual genes transcribed in liver and testis.

    Direct-coupling analysis of residue co-evolution captures native contacts across many protein families

    The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced Direct Coupling Analysis (DCA) (Weigt et al. (2009) Proc Natl Acad Sci 106:67). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intra-domain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and inter-domain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, provided the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing. Comment: 28 pages, 7 figures, to appear in PNA