137 research outputs found

    The Dundee Resource for Sequence Analysis and Structure Prediction

    The Dundee Resource for Sequence Analysis and Structure Prediction (DRSASP; http://www.compbio.dundee.ac.uk/drsasp.html) is a collection of web services provided by the Barton Group at the University of Dundee. DRSASP's flagship services are the JPred4 webserver for secondary structure and solvent accessibility prediction and the JABAWS 2.2 webserver for multiple sequence alignment, disorder prediction, amino acid conservation calculations, and specificity-determining site prediction. DRSASP resources are available through conventional web interfaces and APIs but are also integrated into the Jalview sequence analysis workbench, which enables the composition of multitool interactive workflows. Other existing Barton Group tools are being brought under the banner of DRSASP, including NoD (Nucleolar localization sequence detector) and 14-3-3-Pred. New resources are being developed that enable the analysis of population genetic data in evolutionary and 3D structural contexts. Existing resources are actively developed to exploit new technologies and maintain parity with evolving web standards. DRSASP provides substantial computational resources for public use, and since 2016 DRSASP services have completed over 1.5 million jobs.

    LIGYSIS-web: a resource for the analysis of protein-ligand binding sites

    LIGYSIS-web is a free website for the analysis of protein-ligand binding sites, accessible to all users without any login requirement. LIGYSIS-web hosts a database of 65,000 protein-ligand binding sites across 25,000 proteins. LIGYSIS sites are defined by aggregating unique relevant protein-ligand interfaces across different biological assemblies of the same protein deposited on the PDBe. Additionally, users can upload their own structures in PDB or mmCIF format for analysis and subsequent visualisation and download. Ligand sites are characterised using evolutionary divergence from a multiple sequence alignment, human missense genetic variation from gnomAD and relative solvent accessibility to obtain accessibility-based cluster labels and scores indicating likelihood of function. These results are displayed in the LIGYSIS web server, a Python Flask web application with a JavaScript frontend employing Jinja and jQuery to link the 3Dmol.js structure viewer with dynamic tables and Chart.js graphs in an interactive manner. LIGYSIS-web is available at https://www.compbio.dundee.ac.uk/ligysis/ whilst the source code for the analysis pipelines and web application can be accessed at https://github.com/bartongroup/LIGYSIS, https://github.com/bartongroup/LIGYSIS-custom, and https://github.com/bartongroup/LIGYSIS-web, respectively.

    Disease related single point mutations alter the global dynamics of a tetratricopeptide (TPR) α-solenoid domain

    Tetratricopeptide repeat (TPR) proteins belong to the class of α-solenoid proteins, in which repetitive units of α-helical hairpin motifs stack to form superhelical, often highly flexible structures. TPR domains occur in a wide variety of proteins, and perform key functional roles including protein folding, protein trafficking, cell cycle control and post-translational modification. Here, we look at the TPR domain of the enzyme O-linked GlcNAc-transferase (OGT), which catalyses O-GlcNAcylation of a broad range of substrate proteins. A number of single-point mutations in the TPR domain of human OGT have been associated with the disease Intellectual Disability (ID). By extended steered and equilibrium atomistic simulations, we show that the OGT-TPR domain acts as an elastic nanospring, and that each of the ID-related local mutations substantially affects the global dynamics of the TPR domain. Since the nanospring character of the OGT-TPR domain is key to its function in binding and releasing OGT substrates, these changes of its biomechanics likely lead to defective substrate interaction. We find that neutral mutations in the human population, selected by analysis of the gnomAD database, do not incur these changes. Our findings may not only help to explain the ID phenotype of the mutants, but also aid the design of TPR proteins with tailored biomechanical properties.

    A unified analysis of evolutionary and population constraint in protein domains highlights structural features and pathogenic sites

    Protein evolution is constrained by structure and function, creating patterns in residue conservation that are routinely exploited to predict structure and other features. Similar constraints should affect variation across individuals, but it is only with the growth of human population sequencing that this has been tested at scale. Human population constraint now has established applications in pathogenicity prediction, but it has not yet been explored for structural inference. Here, we map 2.4 million population variants to 5885 protein families and quantify residue-level constraint with a new Missense Enrichment Score (MES). Analysis of 61,214 structures from the PDB spanning 3661 families shows that missense-depleted sites are enriched in buried residues or those involved in small-molecule or protein binding. MES is complementary to evolutionary conservation, and a combined analysis allows a new classification of residues according to a conservation plane. This approach finds functional residues that are evolutionarily diverse, which can be related to specificity, as well as family-wide conserved sites that are critical for folding or function. We also find a possible contrast between lethal and non-lethal pathogenic sites, and a surprising clinical variant hot spot at a subset of missense-enriched positions.
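The abstract above does not state how the Missense Enrichment Score is computed. As a rough illustration only, the sketch below assumes a residue-level enrichment score can be expressed as the odds ratio of a 2x2 table comparing one alignment column with the rest of the family. The function name, the table layout, and the pseudocount are all hypothetical and are not taken from the paper.

```python
import math


def missense_enrichment(col_missense, col_other, rest_missense, rest_other,
                        pseudocount=0.5):
    """Odds ratio of missense variants at one alignment column vs. the rest.

    Hypothetical 2x2 table (an assumption, not the published definition):
        col_missense  - residues in the column carrying a missense variant
        col_other     - residues in the column without one
        rest_missense - residues elsewhere in the family with a missense variant
        rest_other    - residues elsewhere in the family without one
    A pseudocount is added to every cell so that empty cells do not cause
    division by zero.
    """
    a = col_missense + pseudocount
    b = col_other + pseudocount
    c = rest_missense + pseudocount
    d = rest_other + pseudocount
    odds_ratio = (a * d) / (b * c)
    # Values below 1 (negative log) suggest depletion, above 1 enrichment.
    return odds_ratio, math.log(odds_ratio)


# Made-up counts: a column with 1 variant in 100 residues, against a
# background of 100 variants in 1000 residues.
ratio, log_ratio = missense_enrichment(1, 99, 100, 900)
print(f"odds ratio = {ratio:.3f}, log odds ratio = {log_ratio:.3f}")
```

Under this assumed definition, a column whose variant rate matches the background gives an odds ratio of 1, so depleted and enriched positions fall on either side of zero on the log scale.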

    Classification of likely functional class for ligand binding sites identified from fragment screening

    Fragment screening is used to identify binding sites and leads in drug discovery, but it is often unclear which binding sites are functionally important. Here, data from 37 experiments and 1309 protein structures binding to 1601 ligands were analysed. A method to group ligands by binding sites is introduced, and sites are clustered according to profiles of relative solvent accessibility. This identified 293 unique ligand binding sites, grouped into four clusters (C1-4). C1 includes larger, buried, conserved, and population missense-depleted sites, enriched in known functional sites. C4 comprises smaller, accessible, divergent, missense-enriched sites, depleted in functional sites. A site in C1 is 28 times more likely to be functional than one in C4. Seventeen sites in 13 proteins, which to the best of our knowledge are novel, are identified as likely to be functionally important, with examples from human tenascin and 5-aminolevulinate synthase highlighted. A multi-layer perceptron and a K-nearest neighbours model are presented to predict cluster labels for ligand binding sites with an accuracy of 96% and 100%, respectively, so allowing functional classification of sites for proteins not in this set. Our findings will be of interest to those studying protein-ligand interactions and developing new drugs or function modulators.
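To make the idea of predicting cluster labels with K-nearest neighbours concrete, here is a minimal pure-Python sketch. It assumes each binding site has already been summarised as a fixed-length vector of relative solvent accessibility values. That representation, the function name, and the toy data are hypothetical, and this is not the authors' model or feature set.

```python
import math
from collections import Counter


def knn_predict(train_profiles, train_labels, query, k=3):
    """Return the majority cluster label among the k nearest training sites.

    Profiles are equal-length sequences of numbers (an assumed encoding of
    a site's relative solvent accessibility), compared by Euclidean distance.
    """
    neighbours = sorted(
        (math.dist(profile, query), label)
        for profile, label in zip(train_profiles, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up profiles: low values mimic buried sites, high values exposed ones.
profiles = [
    (0.05, 0.10, 0.15),
    (0.10, 0.05, 0.20),
    (0.80, 0.90, 0.70),
    (0.75, 0.85, 0.95),
]
labels = ["C1", "C1", "C4", "C4"]

print(knn_predict(profiles, labels, (0.10, 0.10, 0.10)))
```

In practice a library implementation with cross-validation would be the more usual choice; the hand-written version is used here only to keep the example free of dependencies.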

    Ankyrin repeats in context with human population variation

    Ankyrin protein repeats bind to a wide range of substrates and are one of the most common protein motifs in nature. Here, we collate a high-quality alignment of 7,407 ankyrin repeats and examine, for the first time, the distribution of human population variants from large-scale sequencing of healthy individuals across this family. Population variants are not randomly distributed across the genome but are constrained by gene essentiality and function. Accordingly, we interpret the population variants in context with evolutionary constraint and structural features including secondary structure, accessibility and protein-protein interactions across 383 three-dimensional structures of ankyrin repeats. We find five positions that are highly conserved across homologues and also depleted in missense variants within the human population. These positions are significantly enriched in intra-domain contacts and so are likely to be key for repeat packing. In contrast, a group of evolutionarily divergent positions are found to be depleted in missense variants in human and significantly enriched in protein-protein interactions. Our analysis also suggests the domain has three surfaces, not two, each with different patterns of enrichment in protein-substrate interactions and missense variants. Our findings will be of interest to those studying or engineering ankyrin-repeat containing proteins as well as those interpreting the significance of disease variants.

    Effects of common mutations in the SARS-CoV-2 Spike RBD domain and its ligand the human ACE2 receptor on binding affinity and kinetics

    The interaction between the SARS-CoV-2 virus Spike protein receptor binding domain (RBD) and the ACE2 cell surface protein is required for viral infection of cells. Mutations in the RBD are present in SARS-CoV-2 variants of concern that have emerged independently worldwide. For example, the B.1.1.7 lineage has a mutation (N501Y) in its Spike RBD that enhances binding to ACE2. There are also ACE2 alleles in humans with mutations in the RBD binding site. Here we perform a detailed affinity and kinetics analysis of the effect of five common RBD mutations (K417N, K417T, N501Y, E484K, and S477N) and two common ACE2 mutations (S19P and K26R) on the RBD/ACE2 interaction. We analysed the effects of individual RBD mutations and combinations found in the new SARS-CoV-2 Alpha (B.1.1.7), Beta (B.1.351), and Gamma (P1) variants. Most of these mutations increased the affinity of the RBD/ACE2 interaction. The exceptions were mutations K417N/T, which decreased the affinity. Taken together with other studies, our results suggest that the N501Y and S477N mutations enhance transmission primarily by enhancing binding, the K417N/T mutations facilitate immune escape, and the E484K mutation enhances binding and immune escape.
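For readers less familiar with how affinity relates to kinetics, the short sketch below uses the standard relation for a simple 1:1 binding model, K_D = k_off / k_on, and expresses a mutation's effect as a fold change in affinity. The function names and the rate constants are illustrative placeholders, not measurements from the study.

```python
def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M) for 1:1 binding.

    k_on  - association rate constant in M^-1 s^-1
    k_off - dissociation rate constant in s^-1
    """
    return k_off / k_on


def affinity_fold_change(kd_wild_type, kd_mutant):
    """Fold change in affinity; values above 1 mean the mutant binds tighter."""
    return kd_wild_type / kd_mutant


# Placeholder rate constants chosen only to illustrate the arithmetic.
kd_wt = dissociation_constant(k_on=1.0e5, k_off=1.0e-2)
kd_mut = dissociation_constant(k_on=1.0e5, k_off=1.0e-3)
print(f"fold change in affinity: {affinity_fold_change(kd_wt, kd_mut):.1f}")
```

Because K_D depends on both rate constants, a mutation can raise affinity either by speeding association or by slowing dissociation, which is why kinetic measurements add information beyond the equilibrium value alone.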