152 research outputs found

    The Caenorhabditis elegans Gene mfap-1 Encodes a Nuclear Protein That Affects Alternative Splicing

    RNA splicing is a major regulatory mechanism for controlling eukaryotic gene expression. By generating various splice isoforms from a single pre-mRNA, alternative splicing plays a key role in promoting the evolving complexity of metazoans. Numerous splicing factors have been identified. However, the in vivo functions of many splicing factors remain to be understood. In vivo studies are essential for understanding the molecular mechanisms of RNA splicing and the biology of numerous RNA splicing-related diseases. We previously isolated a Caenorhabditis elegans mutant defective in an essential gene from a genetic screen for suppressors of the rubberband Unc phenotype of unc-93(e1500) animals. This mutant contains missense mutations in two adjacent codons of the C. elegans microfibrillar-associated protein 1 gene mfap-1. mfap-1(n4564 n5214) suppresses the Unc phenotypes of different rubberband Unc mutants in a pattern similar to that of mutations in the splicing factor genes uaf-1 (the C. elegans U2AF large subunit gene) and sfa-1 (the C. elegans SF1/BBP gene). We used the endogenous gene tos-1 as a reporter for splicing and detected increased intron 1 retention and exon 3 skipping of tos-1 transcripts in mfap-1(n4564 n5214) animals. Using a yeast two-hybrid screen, we isolated splicing factors as potential MFAP-1 interactors. Our studies indicate that C. elegans mfap-1 encodes a splicing factor that can affect alternative splicing. Funding: National Natural Science Foundation (China) (Grant 30971639); United States National Institutes of Health (Grant GM24663).

    Combining Computational Prediction of Cis-Regulatory Elements with a New Enhancer Assay to Efficiently Label Neuronal Structures in the Medaka Fish

    The developing vertebrate nervous system contains a remarkable array of neural cells organized into complex, evolutionarily conserved structures. The labeling of living cells in these structures is key for the understanding of brain development and function, yet the generation of stable lines expressing reporter genes in specific spatio-temporal patterns remains a limiting step. In this study we present a fast and reliable pipeline to efficiently generate a set of stable lines expressing a reporter gene in multiple neuronal structures in the developing nervous system in medaka. The pipeline combines the accurate computational genome-wide prediction of neuronal-specific cis-regulatory modules (CRMs) with a newly developed experimental setup to rapidly obtain transgenic lines in a cost-effective and highly reproducible manner. 95% of the CRMs tested in our experimental setup show enhancer activity in numerous neuronal structures belonging to all major brain subdivisions. This pipeline represents a significant step towards the dissection of embryonic neuronal development in vertebrates.

    WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming more widely used given the ever-increasing amount of genomic data available. RESULTS: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present does not need or compute an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. CONCLUSION: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

    An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome

    In eukaryotic genomes, it is challenging to accurately determine target sites of transcription factors (TFs) using sequence information alone. Previous efforts tackled this task by exploiting two observations: TF binding sites tend to be more conserved than other functional sites, and the binding sites of several TFs are often clustered. Recently, data from ChIP-chip and ChIP-sequencing experiments have accumulated that identify TF binding sites and survey chromatin modification patterns at regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) that incorporates sequence motif information, TF-DNA interaction data and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs (CREB, E2F1, MAX, and YY1) in 1% of the human genome. We then trained the HMM to identify the labels of the CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data were used to predict the functional sites and to further remove false positives. Cross-validation showed that our integrated HMM outperformed other existing methods in predicting CRMs. Incorporating histone signature information successfully penalized false predictions and improved overall performance. The dataset we used and the software are available at http://nash.ucsd.edu/CIS/
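
As a rough illustration of how an HMM can label genomic windows as CRM or background, the sketch below runs Viterbi decoding on a toy two-state model. It is not the authors' model: the state names, the binary "evidence" observations (1 = a window with a motif hit or ChIP enrichment, 0 = none), and every probability are assumptions chosen for illustration only.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for obs (log-space Viterbi)."""
    # Scores for the first observation.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
               for s in states}]
    backpointers = []
    for o in obs[1:]:
        column, pointer = {}, {}
        for s in states:
            # Best previous state to transition from into s.
            prev = max(states,
                       key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            column[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][o]))
            pointer[s] = prev
        scores.append(column)
        backpointers.append(pointer)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path

# Hypothetical parameters: both states are "sticky", and CRM windows
# show evidence far more often than background windows.
STATES = ("bg", "crm")
START = {"bg": 0.9, "crm": 0.1}
TRANS = {"bg": {"bg": 0.9, "crm": 0.1}, "crm": {"bg": 0.1, "crm": 0.9}}
EMIT = {"bg": {0: 0.9, 1: 0.1}, "crm": {0: 0.2, 1: 0.8}}

if __name__ == "__main__":
    windows = [0, 0, 1, 1, 1, 0, 0]
    print(viterbi(windows, STATES, START, TRANS, EMIT))
```

With these assumed parameters a run of consecutive evidence windows is labeled as a CRM, while a single isolated hit is absorbed into the background, which is the kind of false-positive suppression a sticky transition matrix provides.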

    Cognitive and psychosocial development of HIV pediatric patients receiving highly active anti-retroviral therapy: a case-control study

    BACKGROUND: The psychosocial development of pediatric HIV patients has not been extensively evaluated. The study objective was to evaluate whether emotional and social functions are differentially associated with HIV-related complications. METHODS: A matched case-control study was conducted. The case group (n = 20) consisted of vertically infected children with HIV (aged 3-18 years) receiving HAART in Greece. Each case was matched with two randomly selected healthy controls from a school-based population. CNS imaging and clinical findings were used to identify patients with HIV-related neuroimaging abnormalities. The Wechsler Intelligence Scale III and Griffiths Mental Abilities Scales were applied to assess cognitive abilities. The age-specific Strengths and Difficulties Questionnaire was used to evaluate emotional adjustment and social skills. Fisher's exact test, Student's t-test, and the Wilcoxon rank sum test were used to compare categorical, continuous, and ordinal scores, respectively, of the above scales between groups. RESULTS: HIV patients without neuroimaging abnormalities did not differ from patients with neuroimaging abnormalities with respect to either age at HAART initiation (p = 0.306) or months of HAART treatment (p = 0.964). While HIV patients without neuroimaging abnormalities had cognitive development similar to that of their healthy peers, patients with neuroimaging abnormalities had lower mean General (p = 0.027) and Practical (p = 0.042) Intelligence Quotient scores. HIV patients without neuroimaging abnormalities had an increased likelihood of both Abnormal Emotional Symptoms (p = 0.047) and Hyperactivity scores (p = 0.0009). In contrast, HIV patients with neuroimaging abnormalities had an increased likelihood of presenting with Abnormal Peer Problems (p = 0.033). CONCLUSIONS: HIV patients without neuroimaging abnormalities are more likely to experience maladjustment with respect to their emotional and activity spheres, while HIV patients with neuroimaging abnormalities are more likely to present with compromised social skills. Due to the limited sample size and age distribution of the study population, further studies should investigate the psychosocial development of pediatric HIV patients following the disclosure of their condition.

    A combinatorial optimization approach for diverse motif finding applications

    BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied. RESULTS: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem. CONCLUSION: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications.

    Using direct observations on multiple occasions to measure household food availability among low-income Mexicano residents in Texas colonias

    BACKGROUND: The availability of foods in the home is recognized as important to nutritional health and may influence the dietary behavior of children, adolescents, and adults. It is therefore important to understand food choices in the context of the household setting, which makes the measurement of household food resources critical. Most studies use a single point of data collection to determine the types of foods present in the home, which can miss changes in availability within a month and periods when resources are not available. The primary objective of this pilot study was therefore to examine the feasibility and value of conducting weekly in-home assessments of household food resources over the course of one month among low-income Mexicano families in Texas colonias. METHODS: We conducted five in-home household food inventories over a thirty-day period in a small convenience sample; determined the frequency that food items were present in the participating households; and compared a one-time measurement with multiple measurements. After the development and pre-testing of the 252-item culturally and linguistically appropriate household food inventory (HFI) instrument, which used direct observation to determine the presence and amount of food and beverage items in the home (refrigerator, freezer, pantry, elsewhere), two trained promotoras recruited a convenience sample of 6 households; administered a baseline questionnaire (personal information, shopping habits, and food security); conducted 5 in-home assessments (7-day interval) over a 30-day period; and documented grocery shopping and other food-related activities within the week before each in-home assessment. All data were collected in Spanish. Descriptive statistics were calculated for mean and frequency of sample characteristics, food-related activities, food security, and the presence of individual food items. Due to the small sample size of the pilot data, the Friedman test and Kendall's W were used to assess the consistency of household food supplies across multiple observations. RESULTS: Complete data were collected from all 6 Mexicano women (33.2 +/- 3.3 y; 6.5 +/- 1.5 adults/children in household (HH)); 5 HH received weekly income, and all were food insecure. All households purchased groceries within a week of at least four of the five assessments. The weekly presence and amounts of fresh and processed fruits and vegetables, dairy, meats, breads, cereals, beverages, and oils and fats varied. Further, the results revealed the inadequacy of a one-time measurement of household food resources, compared with multiple measures. The first household food inventory as a one-time measure would have mistakenly identified at least one-half of the participant households as being without fresh fruit, canned vegetables, dairy, protein foods, grains, chips, and sugar-sweetened beverages. CONCLUSIONS: This study highlights the value of documenting weekly household food supplies, especially in households where income resources may be more volatile. The data show that a single HFI may miss the changes in availability, both presence and amount, that occur among low-income Mexicano households who face challenges that require frequent purchase of foods and beverages. Use of multiple household food inventories can inform the development and implementation of nutrition-related policies and culturally sensitive nutrition education programs.
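
For readers unfamiliar with the two statistics named in the methods, the following minimal sketch computes Kendall's W and the Friedman chi-square statistic derived from it. The data layout (one row per household, one column per assessment occasion), the function names, and the example counts are all assumptions for illustration; the study's actual data and software are not given in the abstract, and this simplified version assumes no tied values within a row.

```python
def kendalls_w(table):
    """Kendall's coefficient of concordance for an m x n table.

    table[i][j] is the value for block i (e.g. a household) under
    condition j (e.g. an assessment occasion). Assumes no ties
    within a row, so no tie correction is applied.
    """
    m, n = len(table), len(table[0])
    rank_sums = [0.0] * n
    for row in table:
        # Rank the n conditions within this block, 1 = smallest value.
        order = sorted(range(n), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    mean_rank_sum = m * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

def friedman_chi2(table):
    """Friedman statistic, related to Kendall's W by chi2 = m (n - 1) W."""
    m, n = len(table), len(table[0])
    return m * (n - 1) * kendalls_w(table)

if __name__ == "__main__":
    # Hypothetical counts of one food category in 3 households over 5 visits.
    counts = [
        [4, 9, 2, 7, 5],
        [3, 8, 1, 6, 5],
        [6, 7, 2, 9, 4],
    ]
    print(kendalls_w(counts), friedman_chi2(counts))
```

W ranges from 0 (occasions are ranked with no agreement across households) to 1 (every household ranks the occasions identically). Real inventory counts often contain ties, in which case a tie-corrected implementation from a statistics library would be the safer choice.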

    RNAcontext: A New Method for Learning the Sequence and Structure Binding Preferences of RNA-Binding Proteins

    Metazoan genomes encode hundreds of RNA-binding proteins (RBPs). These proteins regulate post-transcriptional gene expression and have critical roles in numerous cellular processes including mRNA splicing, export, stability and translation. Despite their ubiquity and importance, the binding preferences for most RBPs are not well characterized. In vitro and in vivo studies, using affinity selection-based approaches, have successfully identified RNA sequences associated with specific RBPs; however, it is difficult to infer RBP sequence and structural preferences without specifically designed motif-finding methods. In this study, we introduce a new motif-finding method, RNAcontext, designed to elucidate RBP-specific sequence and structural preferences with greater accuracy than existing approaches. We evaluated RNAcontext on recently published in vitro and in vivo RNA affinity selected data and demonstrate that RNAcontext identifies known binding preferences for several control proteins including HuR, PTB, and Vts1p and predicts new RNA structure preferences for SF2/ASF, RBM4, FUSIP1 and SLM2. The predicted preferences for SF2/ASF are consistent with its recently reported in vivo binding sites. RNAcontext is an accurate and efficient motif-finding method ideally suited for using large-scale RNA-binding affinity datasets to determine the relative binding preferences of RBPs for a wide range of RNA sequences and structures.

    A Primer on Regression Methods for Decoding cis-Regulatory Logic

    The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A).

    Learning cis-Regulatory Elements from Omics Data

    A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6], in particular by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together, whereas only the primary response genes are expected to contain the functional motifs [10].

    A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain, they remain largely obscure to the broader biology community. The purpose of this tutorial is to bridge this gap. We will focus on transcriptional regulation to introduce the concepts. However, these techniques may be applied to other regulatory processes. We will consider only eukaryotes in this tutorial.
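
To make the regression idea concrete, here is a minimal sketch, not taken from the tutorial itself, that fits a single-predictor ordinary least squares model relating the number of occurrences of one motif in each gene's promoter to that gene's log expression ratio. The function name, the variable names, and the toy numbers are hypothetical; the methods cited in the abstract generally use many motifs and richer models.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

if __name__ == "__main__":
    # Hypothetical data: motif occurrences per promoter and the
    # log expression ratio of the corresponding gene.
    motif_counts = [0, 1, 2, 3]
    log_ratios = [0.5, 1.5, 2.5, 3.5]
    intercept, slope = fit_line(motif_counts, log_ratios)
    print(intercept, slope)
```

In this framing the fitted slope plays the role of the motif's estimated contribution to expression: a clearly positive slope would suggest an activating element and a negative slope a repressive one, under the strong assumption that the relationship is linear.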