    ISSD Version 2.0: taxonomic range extended

    Two more organisms from different taxonomic groups were added to a new version of the Integrated Sequence-Structure Database (ISSD). ISSD serves as an integrated source of sequence and structure information for the analysis of correlations between mRNA synonymous codon usage and the three-dimensional structure of the encoded proteins. ISSD now holds 88 non-homologous Escherichia coli proteins and 25 yeast Saccharomyces cerevisiae proteins in addition to the expanded set of mammalian proteins, which includes 166 proteins (107 in ISSD Version 1.0). Comparison of ISSD sequences with organism-specific codon usage data derived from the CUTG database shows that it is a representative subset of the GenBank coding sequence data. Preliminary results of the statistical analysis confirm that the sequence-structure correlations we observed earlier are also present in the upgraded ISSD (Version 2.0), including bacterial and yeast proteins. The ISSD Version 2.0 release includes an improved web-based data search and retrieval system and is accessible via URL http://www.protein.bio.msu.su/issd/. ISSD can also be accessed at ExPASy, URL http://www.expasy.ch/swissmod/swiss-model.htm

    Bioactivity and structural properties of chimeric analogs of the starfish SALMFamide neuropeptides S1 and S2

    The starfish SALMFamide neuropeptides S1 (GFNSALMFamide) and S2 (SGPYSFNSGLTFamide) are the prototypical members of a family of neuropeptides that act as muscle relaxants in echinoderms. Comparison of the bioactivity of S1 and S2 as muscle relaxants has revealed that S2 is ten times more potent than S1. Here we investigated a structural basis for this difference in potency by comparing the bioactivity and solution conformations (using NMR and CD spectroscopy) of S1 and S2 with three chimeric analogs of these peptides. A peptide comprising S1 with the addition of S2's N-terminal tetrapeptide (Long S1 or LS1; SGPYGFNSALMFamide) was not significantly different to S1 in its bioactivity and did not exhibit the concentration-dependent structuring seen with S2. An analog of S1 with its penultimate residue substituted from S2 (S1(T); GFNSALTFamide) exhibited S1-like bioactivity and structure. However, an analog of S2 with its penultimate residue substituted from S1 (S2(M); SGPYSFNSGLMFamide) exhibited loss of S2-type bioactivity and structural properties. Collectively, our data indicate that the C-terminal regions of S1 and S2 are the key determinants of their differing bioactivity. However, the N-terminal region of S2 may influence its bioactivity by conferring structural stability in solution. Thus, analysis of chimeric SALMFamides has revealed how neuropeptide bioactivity is determined by a complex interplay of sequence and conformation.

    Transcriptomic insights into genetic diversity of protein-coding genes in X. laevis

    © The Author(s), 2017. This is the author's version of the work and is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Developmental Biology 424 (2017): 181-188, doi:10.1016/j.ydbio.2017.02.019
    We characterize the genetic diversity of Xenopus laevis strains using RNA-seq data and allele-specific analysis. This data provides a catalogue of coding variation, which can be used for improving the genomic sequence, as well as for better sequence alignment, probe design, and proteomic analysis. In addition, we paint a broad picture of the genetic landscape of the species by functionally annotating different classes of mutations with a well-established prediction tool (PolyPhen-2). Further, we specifically compare the variation in the progeny of four crosses: inbred genomic (J)-strain, outbred albino (B)-strain, and two hybrid crosses of J and B strains. We identify a subset of mutations specific to the B strain, which allows us to investigate the selection pressures affecting duplicated genes in this allotetraploid. From these crosses we find that the ratio of non-synonymous to synonymous mutations is lower in duplicated genes, which suggests that they are under greater purifying selection. Surprisingly, we also find that function-altering ("damaging") mutations constitute a greater fraction of the non-synonymous variants in this group, which suggests a role for subfunctionalization in coding variation affecting duplicated genes.
    L.P. was supported by the NIH grant R01HD073104; L.P., A.N. and V.S. were also supported by R21HD81675, and M.H. and E.P. by P40 OD010997. 2018-03-0

    HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants

    The resolution of genome-wide association studies (GWAS) is limited by the linkage disequilibrium (LD) structure of the population being studied. Selecting the most likely causal variants within an LD block is relatively straightforward within coding sequence, but is more difficult when all variants are intergenic. Predicting functional non-coding sequence has been recently facilitated by the availability of conservation and epigenomic information. We present HaploReg, a tool for exploring annotations of the non-coding genome among the results of published GWAS or novel sets of variants. Using LD information from the 1000 Genomes Project, linked SNPs and small indels can be visualized along with their predicted chromatin state in nine cell types, conservation across mammals and their effect on regulatory motifs. Sets of SNPs, such as those resulting from GWAS, are analyzed for an enrichment of cell type-specific enhancers. HaploReg will be useful to researchers developing mechanistic hypotheses of the impact of non-coding variants on clinical phenotypes and normal variation. The HaploReg database is available at http://compbio.mit.edu/HaploReg.
    National Institutes of Health (U.S.) (R01-HG004037); National Institutes of Health (U.S.) (RC1-HG005334); National Science Foundation (U.S.) (HG005334

    De novo mutations in SMCHD1 cause Bosma arhinia microphthalmia syndrome and abrogate nasal development

    Bosma arhinia microphthalmia syndrome (BAMS) is an extremely rare and striking condition characterized by complete absence of the nose with or without ocular defects. We report here that missense mutations in the epigenetic regulator SMCHD1 mapping to the extended ATPase domain of the encoded protein cause BAMS in all 14 cases studied. All mutations were de novo where parental DNA was available. Biochemical tests and in vivo assays in Xenopus laevis embryos suggest that these mutations may behave as gain-of-function alleles. This finding is in contrast to the loss-of-function mutations in SMCHD1 that have been associated with facioscapulohumeral muscular dystrophy (FSHD) type 2. Our results establish SMCHD1 as a key player in nasal development and provide biochemical insight into its enzymatic function that may be exploited for development of therapeutics for FSHD.

    Establishing the precise evolutionary history of a gene improves prediction of disease-causing missense mutations

    PURPOSE: Predicting the phenotypic effects of mutations has become an important application in clinical genetic diagnostics. Computational tools evaluate the behavior of the variant over evolutionary time and assume that variations seen during the course of evolution are probably benign in humans. However, current tools do not take into account orthologous/paralogous relationships. Paralogs have dramatically different roles in Mendelian diseases. For example, whereas inactivating mutations in the NPC1 gene cause the neurodegenerative disorder Niemann-Pick C, inactivating mutations in its paralog NPC1L1 are not disease-causing and, moreover, are implicated in protection from coronary heart disease. METHODS: We identified major events in NPC1 evolution and revealed and compared orthologs and paralogs of the human NPC1 gene through phylogenetic and protein sequence analyses. We predicted whether an amino acid substitution affects protein function by reducing the organism’s fitness. RESULTS: Removing the paralogs and distant homologs improved the overall performance of categorizing disease-causing and benign amino acid substitutions. CONCLUSION: The results show that a thorough evolutionary analysis followed by identification of orthologs improves the accuracy in predicting disease-causing missense mutations. We anticipate that this approach will be used as a reference in the interpretation of variants in other genetic diseases as well.
    Genet Med 18 10, 1029–1036

    mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome

    A new algorithm and Web server, mutation3D (http://mutation3d.org), proposes driver genes in cancer by identifying clusters of amino acid substitutions within tertiary protein structures. We demonstrate the feasibility of using a 3D clustering approach to implicate proteins in cancer based on explorations of single proteins using the mutation3D Web interface. On a large scale, we show that clustering with mutation3D is able to separate functional from nonfunctional mutations by analyzing a combination of 8,869 known inherited disease mutations and 2,004 SNPs overlaid together upon the same sets of crystal structures and homology models. Further, we present a systematic analysis of whole-genome and whole-exome cancer datasets to demonstrate that mutation3D identifies many known cancer genes as well as previously underexplored target genes. The mutation3D Web interface allows users to analyze their own mutation data in a variety of popular formats and provides seamless access to explore mutation clusters derived from over 975,000 somatic mutations reported by 6,811 cancer sequencing studies. The mutation3D Web interface is freely available with all major browsers supported.

    The Phyre2 web portal for protein modeling, prediction and analysis

    Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.

    Deriving a mutation index of carcinogenicity using protein structure and protein interfaces

    With the advent of Next Generation Sequencing, the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, both for assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/