476 research outputs found

    The RNA Helicase DDX6 Controls Cellular Plasticity by Modulating P-Body Homeostasis

    Post-transcriptional mechanisms have the potential to influence complex changes in gene expression, yet their role in cell fate transitions remains largely unexplored. Here, we show that suppression of the RNA helicase DDX6 endows human and mouse primed embryonic stem cells (ESCs) with a differentiation-resistant, “hyper-pluripotent” state, which readily reprograms to a naive state resembling the preimplantation embryo. We further demonstrate that DDX6 plays a key role in adult progenitors where it controls the balance between self-renewal and differentiation in a context-dependent manner. Mechanistically, DDX6 mediates the translational suppression of target mRNAs in P-bodies. Upon loss of DDX6 activity, P-bodies dissolve and release mRNAs encoding fate-instructive transcription and chromatin factors that re-enter the ribosome pool. Increased translation of these targets impacts cell fate by rewiring the enhancer, heterochromatin, and DNA methylation landscapes of undifferentiated cell types. Collectively, our data establish a link between P-body homeostasis, chromatin organization, and stem cell potency.

    PROCAIN server for remote protein sequence similarity search

    Sensitive and accurate detection of distant protein homology is essential for the studies of protein structure, function and evolution. We recently developed PROCAIN, a method that is based on sequence profile comparison and involves the analysis of four signals: similarities of residue content at the profile positions, combined with three types of assisting information (sequence motifs, residue conservation and predicted secondary structure). Here we present the PROCAIN web server that allows the user to submit a query sequence or multiple sequence alignment and perform the search in a profile database of choice. The output is structured similarly to that of BLAST, with the list of detected homologs sorted by E-value and followed by profile–profile alignments. The front page allows the user to adjust multiple options of input processing and output formatting, as well as search settings, including the relative weights assigned to the three types of assisting information.
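
The abstract does not give PROCAIN's scoring formula, so the following is only a minimal sketch of how a position-pair score might combine a residue-content signal with three separately weighted assisting signals. The function name `combined_position_score`, its parameters, and the simple linear combination are all assumptions made for illustration, not the server's actual implementation.

```python
def combined_position_score(residue_sim, motif_sim, conservation_sim, ss_sim,
                            w_motif=1.0, w_conservation=1.0, w_ss=1.0):
    """Hypothetical linear combination of four per-position signals.

    residue_sim      -- similarity of residue content at two profile positions
    motif_sim        -- agreement of sequence-motif information
    conservation_sim -- agreement of residue conservation
    ss_sim           -- agreement of predicted secondary structure
    w_*              -- relative weights of the three assisting signals
                        (assumed to be user-adjustable, as on the front page)
    """
    return (residue_sim
            + w_motif * motif_sim
            + w_conservation * conservation_sim
            + w_ss * ss_sim)


# Setting a weight to zero switches the corresponding assisting signal off.
score_all = combined_position_score(2.0, 1.0, 0.5, -1.0)
score_no_ss = combined_position_score(2.0, 1.0, 0.5, -1.0, w_ss=0.0)
```

Under this assumption, the relative weights control only how much each assisting signal shifts the base residue-content score.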

    Estimates of statistical significance for comparison of individual positions in multiple sequence alignments

    BACKGROUND: Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) an MSA position and a set of predicted residue frequencies, and (2) two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families. RESULTS: For problems (1) and (2), we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a) We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b) We evaluate our method by the ability to detect erroneous position matches produced by an automatic sequence aligner. (c) We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d) We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. CONCLUSION: The proposed computational method is of significant potential value for the analysis of protein families.
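
To make problem (2) concrete, here is a rough sketch of testing whether two MSA columns differ in residue composition. It is not the analytical P-value estimate proposed in the paper, which the abstract does not spell out. Instead it swaps in a standard Pearson chi-square statistic with a Monte Carlo permutation P-value, and it treats the sequences in each column as independent samples, which real homologs are not. The function names and the default of 2000 permutations are illustrative assumptions.

```python
import random
from collections import Counter


def chi_square_stat(col_a, col_b):
    """Pearson chi-square statistic for a 2 x K table of residue counts.

    col_a, col_b -- non-empty sequences of residue letters, one per aligned
                    sequence, taken from the two MSA positions being compared.
    """
    counts_a, counts_b = Counter(col_a), Counter(col_b)
    n_a, n_b = len(col_a), len(col_b)
    total = n_a + n_b
    stat = 0.0
    for residue in set(counts_a) | set(counts_b):
        pooled = counts_a[residue] + counts_b[residue]
        for observed, n in ((counts_a[residue], n_a), (counts_b[residue], n_b)):
            expected = pooled * n / total  # count expected under one shared distribution
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(col_a, col_b, n_permutations=2000, seed=0):
    """Estimate P(statistic >= observed) by shuffling residues between columns."""
    rng = random.Random(seed)
    observed = chi_square_stat(col_a, col_b)
    pooled = list(col_a) + list(col_b)
    n_a = len(col_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if chi_square_stat(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Two columns with disjoint residue content versus two identical columns.
p_different = permutation_p_value("L" * 20, "D" * 20)
p_same = permutation_p_value("LLLLLIIIII", "LLLLLIIIII")
```

A small P-value flags a pair of positions as significantly dissimilar, which is the kind of signal the abstract uses to detect misaligned regions or specificity-determining sites.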

    Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds

    BACKGROUND: As tertiary structure is currently available only for a fraction of known protein families, it is important to assess what parts of sequence space have been structurally characterized. We consider protein domains whose structure can be predicted by sequence similarity to proteins with solved structure and address the following questions. Do these domains represent an unbiased random sample of all sequence families? Do targets solved by structural genomic initiatives (SGI) provide such a sample? What are approximate total numbers of structure-based superfamilies and folds among soluble globular domains? RESULTS: To make these assessments, we combine two approaches: (i) sequence analysis and homology-based structure prediction for proteins from complete genomes; and (ii) monitoring dynamics of the assigned structure set in time, with the accumulation of experimentally solved structures. In the Clusters of Orthologous Groups (COG) database, we map the growing population of structurally characterized domain families onto the network of sequence-based connections between domains. This mapping reveals a systematic bias suggesting that target families for structure determination tend to be located in highly populated areas of sequence space. In contrast, the subset of domains whose structure is initially inferred by SGI is similar to a random sample from the whole population. To accommodate for the observed bias, we propose a new non-parametric approach to the estimation of the total numbers of structural superfamilies and folds, which does not rely on a specific model of the sampling process. Based on dynamics of robust distribution-based parameters in the growing set of structure predictions, we estimate the total numbers of superfamilies and folds among soluble globular proteins in the COG database. CONCLUSION: The set of currently solved protein structures allows for structure prediction in approximately a third of sequence-based domain families. 
    The choice of targets for structure determination is biased towards domains with many sequence-based homologs. The growing SGI output in the future should further contribute to the reduction of this bias. The total numbers of structural superfamilies and folds in the COG database are estimated as ~4000 and ~1700, respectively. These numbers are four and three times higher, respectively, than the numbers of superfamilies and folds that can currently be assigned to COG proteins.

    A tale of two ferredoxins: sequence similarity and structural differences

    BACKGROUND: Sequence similarity between proteins is usually considered a reliable indicator of homology. Pyruvate-ferredoxin oxidoreductase and quinol-fumarate reductase contain ferredoxin domains that bind [Fe-S] clusters and are involved in electron transport. Profile-based methods for sequence comparison, such as PSI-BLAST and HMMer, suggest statistically significant similarity between these domains. RESULTS: The sequence similarity between these ferredoxin domains resides in the area of the [Fe-S] cluster-binding sites. Although overall folds of these ferredoxins bear no obvious similarity, the regions of sequence similarity display a remarkable local structural similarity. These short regions with pronounced sequence motifs are incorporated in completely different structural environments. In pyruvate-ferredoxin oxidoreductase (bacterial ferredoxin), the hydrophobic core of the domain is completed by two β-hairpins, whereas in quinol-fumarate reductase (α-helical ferredoxin), the cluster-binding motifs are part of a larger all-α-helical globin-like fold core. CONCLUSION: Functionally meaningful sequence similarity may sometimes be reflected only in local structural similarity, but not in global fold similarity. If detected and used naively, such similarities may lead to incorrect fold predictions.