A systematic analysis of scoring functions in rigid-body protein docking: The delicate balance between the predictive rate improvement and the risk of overtraining
Protein-protein interactions play fundamental roles in biological processes including signaling, metabolism, and trafficking. While the structure of a protein complex reveals crucial details about the interaction, it is often difficult to acquire this information experimentally. As the number of interactions discovered increases faster than they can be characterized, protein-protein docking calculations may be able to reduce this disparity by providing models of the interacting proteins. Rigid-body docking is a widely used docking approach, and is often capable of generating a pool of models within which a near-native structure can be found. These models need to be scored in order to select the acceptable ones from the set of poses. Recently, more than 100 scoring functions from the CCharPPI server were evaluated for this task using decoy structures generated with SwarmDock. Here, we extend this analysis to identify the predictive success rates of the scoring functions on decoys from three rigid-body docking programs, ZDOCK, FTDock, and SDOCK, allowing us to assess the transferability of the functions. We also apply a set-theoretic measure to test whether the scoring functions are capable of identifying near-native poses within different subsets of the benchmark. This information can guide the choice of the most efficient scoring function for each docking method, as well as inform future scoring function development efforts.
Grant sponsor: MINECO BIO2013-48213-R; Grant sponsor: CONACyT (D.B.-B.); Grant sponsor: EC FP7-PEOPLE (I.H.M.) PIEF-GA-2012-327899; Grant sponsor: BSRC (I.H.M.) BB/N011600/1.
Peer Reviewed. Postprint (author's final draft)
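The "predictive success rate" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a common docking-benchmark convention that the abstract itself does not spell out: a scoring function succeeds on a complex if at least one near-native decoy appears among its top-N ranked poses. The function name, the data layout, and the "lower score is better" convention are all hypothetical choices made for illustration.

```python
def top_n_success_rate(decoy_sets, n=10):
    """Fraction of complexes with a near-native decoy in the top n poses.

    decoy_sets maps a complex identifier to a list of
    (score, is_near_native) tuples. Lower scores are assumed to be
    better; this is an illustrative convention, not taken from the paper.
    """
    if not decoy_sets:
        raise ValueError("decoy_sets must contain at least one complex")
    hits = 0
    for decoys in decoy_sets.values():
        # Rank the decoys of this complex from best to worst score.
        ranked = sorted(decoys, key=lambda d: d[0])
        # Count a success if any of the top n poses is near-native.
        if any(is_near_native for _, is_near_native in ranked[:n]):
            hits += 1
    return hits / len(decoy_sets)


# Toy data: two hypothetical complexes with three decoys each.
toy = {
    "complex_A": [(-5.0, False), (-3.0, True), (-1.0, False)],
    "complex_B": [(-2.0, False), (-1.0, False), (0.0, True)],
}
rates = {n: top_n_success_rate(toy, n) for n in (1, 2, 3)}
```

Running the same calculation on decoys from different docking programs would give one way to compare how well a scoring function transfers between them.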
An Exact Algorithm for Side-Chain Placement in Protein Design
Computational protein design aims at constructing novel or improved functions
on the structure of a given protein backbone and has important applications in
the pharmaceutical and biotechnical industry. The underlying combinatorial
side-chain placement problem consists of choosing a side-chain placement for
each residue position such that the resulting overall energy is minimum. The
choice of the side-chain then also determines the amino acid for this position.
Many algorithms for this NP-hard problem have been proposed in the context of
homology modeling, which, however, reach their limits when faced with large
protein design instances.
In this paper, we propose a new exact method for the side-chain placement
problem that works well even for large instance sizes as they appear in protein
design. Our main contribution is a dedicated branch-and-bound algorithm that
combines tight upper and lower bounds resulting from a novel Lagrangian
relaxation approach for side-chain placement. Our experimental results show
that our method outperforms alternative state-of-the-art exact approaches and
makes it possible to optimally solve large protein design instances routinely
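
The combinatorial problem described in the abstract above can be sketched as follows. This is a toy branch-and-bound, not the authors' algorithm: it uses a simple optimistic lower bound instead of their Lagrangian relaxation, and it assumes the energy decomposes into per-position self terms plus pairwise terms. All names and the data layout are hypothetical.

```python
def total_energy(assign, self_e, pair_e):
    """Energy of a full assignment: self terms plus all pairwise terms."""
    n = len(assign)
    energy = sum(self_e[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_e[(i, j)][assign[i]][assign[j]]
    return energy


def lower_bound(partial, self_e, pair_e):
    """Optimistic bound on any completion of a partial assignment."""
    n, k = len(self_e), len(partial)
    # Exact energy of the already assigned positions 0..k-1.
    bound = sum(self_e[i][partial[i]] for i in range(k))
    for i in range(k):
        for j in range(i + 1, k):
            bound += pair_e[(i, j)][partial[i]][partial[j]]
    # Best case for each unassigned position against the assigned ones.
    for j in range(k, n):
        bound += min(
            self_e[j][r] + sum(pair_e[(i, j)][partial[i]][r] for i in range(k))
            for r in range(len(self_e[j]))
        )
    # Best case for each pair of unassigned positions.
    for i in range(k, n):
        for j in range(i + 1, n):
            bound += min(min(row) for row in pair_e[(i, j)])
    return bound


def solve(self_e, pair_e):
    """Return (minimum energy, assignment) by depth-first branch and bound."""
    n = len(self_e)
    best = [float("inf"), None]

    def recurse(partial):
        if len(partial) == n:
            energy = total_energy(partial, self_e, pair_e)
            if energy < best[0]:
                best[0], best[1] = energy, tuple(partial)
            return
        pos = len(partial)
        for r in range(len(self_e[pos])):
            partial.append(r)
            # Prune branches that cannot improve on the incumbent.
            if lower_bound(partial, self_e, pair_e) < best[0]:
                recurse(partial)
            partial.pop()

    recurse([])
    return best[0], best[1]


# Toy instance: three positions with two candidate side chains each.
self_e = [[1, 3], [2, 0], [4, 1]]
pair_e = {
    (0, 1): [[0, 5], [2, -1]],
    (0, 2): [[3, 0], [1, 2]],
    (1, 2): [[-2, 4], [0, 1]],
}
energy, assignment = solve(self_e, pair_e)
```

Tighter bounds prune more of the search tree, which is where a relaxation-based bound such as the one the abstract describes would replace `lower_bound` in this sketch.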
The Phyre2 web portal for protein modeling, prediction and analysis
Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission
TMFoldRec: a statistical potential-based transmembrane protein fold recognition tool.
BACKGROUND: Transmembrane proteins (TMPs) are key components of signal transduction, cell-cell adhesion, and the transport of energy and material into and out of cells. A deep understanding of these processes requires structure determination of transmembrane proteins. However, due to technical difficulties, only a few transmembrane protein structures have been determined experimentally. Large-scale genomic sequencing provides increasing amounts of sequence information on the proteins and whole proteomes of living organisms, which poses a challenge for bioinformatics: how structural information can be derived from a sequence. RESULTS: Here, we present a novel method, TMFoldRec, for fold prediction of membrane segments in transmembrane proteins. TMFoldRec, which is based on statistical potentials, was tested on a benchmark set containing 124 TMP chains from the PDBTM database. Using a 10-fold jackknife method, the native folds were correctly identified in 77 % of the cases. This accuracy exceeds that of state-of-the-art methods. In addition, a key feature of the TMFoldRec algorithm is its ability to estimate the reliability of the prediction and to decide, with an accuracy of 70 %, whether the obtained lowest-energy structure is the native one. CONCLUSION: These results imply that the membrane-embedded parts of TMPs, rather than the soluble parts, dictate the TM structures. Moreover, the reliability scores attached to the predictions make our algorithm applicable to proteome-wide analyses. AVAILABILITY: The program is available upon request for academic use
SIMS: A Hybrid Method for Rapid Conformational Analysis
Proteins are at the root of many biological functions, often performing complex tasks as the result of large changes in their
structure. Describing the exact details of these conformational changes, however, remains a central challenge for
computational biology due to the enormous computational requirements of the problem. This has engendered the
development of a rich variety of useful methods designed to answer specific questions at different levels of spatial,
temporal, and energetic resolution. These methods fall largely into two classes: physically accurate, but computationally
demanding methods and fast, approximate methods. We introduce here a new hybrid modeling tool, the Structured
Intuitive Move Selector (SIMS), designed to bridge the divide between these two classes, while allowing the benefits of both
to be seamlessly integrated into a single framework. This is achieved by applying a modern motion planning algorithm,
borrowed from the field of robotics, in tandem with a well-established protein modeling library. SIMS can combine precise
energy calculations with approximate or specialized conformational sampling routines to produce rapid, yet accurate,
analysis of the large-scale conformational variability of protein systems. Several key advancements are shown, including the
abstract use of generically defined moves (conformational sampling methods) and an expansive probabilistic
conformational exploration. We present three example problems that SIMS is applied to and demonstrate a rapid solution
for each. These include the automatic determination of "active" residues for the hinge-based system Cyanovirin-N,
exploring conformational changes involving long-range coordinated motion between non-sequential residues in Ribose-
Binding Protein, and the rapid discovery of a transient conformational state of Maltose-Binding Protein, previously only
determined by Molecular Dynamics. For all cases we provide energetic validations using well-established energy fields,
demonstrating this framework as a fast and accurate tool for the analysis of a wide range of protein flexibility problems
Recombinant nucleases CEL I from celery and SP I from spinach for mutation detection
BACKGROUND: The detection of unknown mutations is important in research and medicine. For this purpose, a mismatch-specific endonuclease CEL I from celery has been established as a useful tool in high throughput projects. Previously, CEL I-like activities were described only in a variety of plants and could not be expressed in an active form in bacteria. RESULTS: We describe expression of active recombinant plant mismatch endonucleases and modification of their activities. We also report the cloning of a CEL I ortholog from Spinacia oleracea (spinach), which we termed SP I nuclease. Active CEL I and SP I nucleases were expressed as C-terminal hexahistidine fusions and affinity purified from the cell culture media. Both recombinant enzymes were active in mutation detection in the BRCA1 gene of patient-derived DNA. Native SP nuclease purified from spinach is unable to incise at single-nucleotide substitutions and loops containing a guanine nucleotide, but the recombinant SP I nuclease can cut at these sites. CONCLUSION: The insect cell-expressed CEL I orthologs may not be identical to their native counterparts purified from plant tissues. The present expression system should facilitate further development of CEL I-based mutation detection technologies.
CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data
CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB formats, and produces a set of PDB coordinates for the protein in about 10–15 min. CS23D uses a pipeline of several preexisting programs or servers to calculate the actual protein structure. Depending on the sequence similarity (or lack thereof) CS23D uses either (i) maximal subfragment assembly (a form of homology modeling), (ii) chemical shift threading or (iii) shift-aided de novo structure prediction (via Rosetta) followed by chemical shift refinement to generate and/or refine protein coordinates. Tests conducted on more than 100 proteins from the BioMagResBank indicate that CS23D converges (i.e. finds a solution) for >95% of protein queries. These chemical shift generated structures were found to be within 0.2–2.8 Å RMSD of the NMR structure generated using conventional NOE-based NMR methods or conventional X-ray methods. The performance of CS23D is dependent on the completeness of the chemical shift assignments and the similarity of the query protein to known 3D folds. CS23D is accessible at http://www.cs23d.ca
