Untranslated regions of mRNAs
Gene expression is finely regulated at the post-transcriptional level. Features of the untranslated regions of mRNAs that control their translation, degradation and localization include stem-loop structures, upstream initiation codons and open reading frames, internal ribosome entry sites and various cis-acting elements that are bound by RNA-binding proteins
Regularized Least Squares Cancer Classifiers from DNA microarray data
BACKGROUND: The advent of DNA microarray technology constitutes an epochal change in the classification and discovery of different types of cancer, because the information provided by DNA microarrays allows cancer analysis to be approached from a quantitative rather than qualitative point of view. Cancer classification requires well-founded mathematical methods that can predict the status of new specimens with high significance levels starting from a limited amount of data. In this paper we assess the performance of Regularized Least Squares (RLS) classifiers, originally proposed in regularization theory, by comparing them with Support Vector Machines (SVM), the state-of-the-art supervised learning technique for cancer classification from DNA microarray data. The performance of both approaches has also been investigated with respect to the number of selected genes and different gene selection strategies. RESULTS: We show that RLS classifiers perform comparably to SVM classifiers, as shown by the Leave-One-Out (LOO) error evaluated on three different data sets. The main advantage of RLS machines is that, to solve a classification problem, they use a linear system of order equal to either the number of features or the number of training examples. Moreover, RLS machines allow an exact measure of the LOO error to be obtained with a single training run. CONCLUSION: RLS classifiers are a valuable alternative to SVM classifiers for the problem of cancer classification from gene expression data, owing to their simplicity and low computational complexity. Moreover, RLS classifiers show generalization ability comparable to that of SVM classifiers even when the classification of new specimens involves very few gene expression levels.
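The claim that an RLS machine yields the exact LOO error from a single training can be illustrated with the standard kernel RLS identity. The sketch below is not taken from the paper: it assumes the dual formulation without a bias term, where the coefficients solve (K + lam·I)c = y, and the names `K`, `lam`, `rls_loo_predictions`, `loo_by_retraining` and `loo_error_rate` are illustrative choices. Under those assumptions, the prediction for example i when it is left out equals y_i − c_i / [(K + lam·I)⁻¹]_ii.

```python
import numpy as np


def rls_loo_predictions(K, y, lam):
    """Exact leave-one-out predictions from a single RLS fit.

    Assumes the dual RLS system (K + lam * I) c = y with no bias term.
    K is the n x n kernel (Gram) matrix, y holds the +/-1 labels.
    """
    n = K.shape[0]
    G_inv = np.linalg.inv(K + lam * np.eye(n))
    c = G_inv @ y
    # Closed-form LOO identity: f_{-i}(x_i) = y_i - c_i / (G^{-1})_{ii}
    return y - c / np.diag(G_inv)


def loo_by_retraining(K, y, lam):
    """Reference LOO predictions obtained by refitting n times (slow)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K_train = K[np.ix_(keep, keep)]
        c = np.linalg.solve(K_train + lam * np.eye(n - 1), y[keep])
        preds[i] = K[i, keep] @ c
    return preds


def loo_error_rate(K, y, lam):
    """Fraction of examples misclassified under leave-one-out."""
    preds = rls_loo_predictions(K, y, lam)
    return float(np.mean(np.sign(preds) != y))
```

With a linear kernel `K = X @ X.T` on an expression matrix `X` (samples by genes), `rls_loo_predictions` and `loo_by_retraining` should agree to numerical precision, which is a convenient sanity check before relying on the closed form for model selection over `lam`.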
A fuzzy method for RNA-Seq differential expression analysis in presence of multireads
Background: When the reads obtained from high-throughput RNA sequencing are mapped against a reference database, a significant proportion of them, known as multireads, can map to more than one reference sequence. These multireads originate from gene duplications, repetitive regions or overlapping genes. Removing the multireads from the mapping results, in RNA-Seq analyses, causes an underestimation of the read counts, while estimating the real read count can lead to false positives during the detection of differentially expressed sequences. Results: We present an innovative approach to deal with multireads and evaluate differential expression events, entirely based on fuzzy set theory. Since multireads cause uncertainty in the estimation of read counts during gene expression computation, they can also influence the reliability of differential expression analysis results, by producing false positives. Our method manages the uncertainty in gene expression estimation by defining fuzzy read counts and evaluates the possibility of a gene being differentially expressed with three fuzzy concepts: over-expression, same-expression and under-expression. The output of the method is a list of differentially expressed genes enriched with information about the uncertainty of the results due to the presence of multireads. We have tested the method on RNA-Seq data designed for case-control studies and we have compared the obtained results with other existing tools for read count estimation and differential expression analysis. Conclusions: Managing multireads with fuzzy sets makes it possible to obtain a list of differential expression events that takes into account the uncertainty in the results caused by the presence of multireads. Such additional information can be used by biologists when they have to select the most relevant differential expression events to validate with laboratory assays. Our method can be used to compute reliable differential expression events and to highlight possible false positives in the lists of differentially expressed genes computed with other tools.
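The abstract does not give the definitions of the fuzzy read counts or of the three fuzzy concepts, so the following is only a hypothetical illustration of the general idea, not the published method. It assumes a triangular fuzzy number per gene whose lower bound is the unique-read count, whose upper bound adds every multiread, and whose mode adds each multiread weighted by one over its number of mapping locations. It then uses the standard possibility measure for comparing two triangular fuzzy numbers. All names (`TriangularCount`, `fuzzy_read_count`, `possibility_geq`) are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularCount:
    """Triangular fuzzy number: membership 0 at low/high, 1 at mode."""
    low: float
    mode: float
    high: float


def fuzzy_read_count(unique_reads, multiread_hits):
    """Build a fuzzy count for one gene (assumed construction).

    unique_reads: number of reads mapping only to this gene.
    multiread_hits: for each multiread touching this gene, the number
    of reference sequences it maps to (each value >= 2).
    """
    low = float(unique_reads)                          # no multiread belongs here
    mode = low + sum(1.0 / h for h in multiread_hits)  # uniform split (assumption)
    high = low + len(multiread_hits)                   # every multiread belongs here
    return TriangularCount(low, mode, high)


def possibility_geq(a, b):
    """Possibility that fuzzy count a is greater than or equal to b."""
    if a.mode >= b.mode:
        return 1.0
    if a.high <= b.low:
        return 0.0
    # Height at which the right slope of a crosses the left slope of b.
    return (a.high - b.low) / ((a.high - a.mode) + (b.mode - b.low))
```

In this sketch, a gene whose case and control counts give `possibility_geq(case, control)` close to 1 and `possibility_geq(control, case)` close to 0 would be a confident over-expression call, whereas two values that are both high would flag a result dominated by multiread uncertainty.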
A Pilot Longitudinal Evaluation of MicroRNAs for Monitoring the Cognitive Impairment in Pediatric Multiple Sclerosis
MicroRNAs (miRNAs), a class of non-coding RNAs, seem to play a key role in complex diseases like multiple sclerosis (MS), as well as in many cognitive functions associated with the disease. In a previous cross-sectional evaluation of pediatric MS (PedMS) patients, the expression of some miRNAs and their target genes was found to be associated with the scores of some neuropsychiatric tests, thus suggesting that they may be involved in early processes of cognitive impairment. To verify these data, we asked the same patients to be re-evaluated after a 1-year interval; unfortunately, only nine of them agreed to this further clinical and molecular analysis. The main results showed that 13 differentially expressed miRNAs discriminated the two time-points. Among them, the expression of miR-182-5p, miR-320a-3p, miR-744-5p and miR-192-5p significantly correlated with attention and information processing speed performance, whereas the expression of miR-182-5p, miR-451a, miR-4742-3p and miR-320a-3p correlated with expressive language performance. The analysis of mRNA expression uncovered 58 predicted and/or validated miRNA-target pairs, including 23 target genes, some of them already associated with cognitive impairment, such as the transducin beta-like 1 X-linked receptor 1 gene (TBL1XR1), linked to neurodevelopmental disorders; the Snf2 related CREBBP activator protein gene (SRCAP), which has been implicated in a rare form of dementia; and the glia maturation factor beta gene (GMFB), which has been reported to be implicated in neurodegeneration and neuroinflammation. No molecular pathways involving the most targeted genes survived the correction for multiple testing. Although preliminary, these findings show the feasibility of applying the methods to longitudinal investigations, as well as the reliability of the obtained results. These findings should be confirmed in larger PedMS cohorts in order to identify early markers of cognitive impairment, towards which more efficient therapeutic efforts can be directed.
WoPPER: Web server for Position Related data analysis of gene Expression in Prokaryotes
The structural and conformational organization of chromosomes is crucial for gene expression regulation in both eukaryotes and prokaryotes. To date, gene expression data generated using either microarrays or RNA sequencing are available for many bacterial genomes. However, differential gene expression is usually investigated with methods that consider each gene independently, thus not taking into account the physical localization of genes along a bacterial chromosome. Here, we present WoPPER, a web tool integrating gene expression and genomic annotations to identify differentially expressed chromosomal regions in bacteria. RNA-sequencing or microarray-based gene expression data are provided as input, along with gene annotations. The user can select genomic annotations from an internal database including 2780 bacterial strains, or provide custom genomic annotations. The analysis produces as output the lists of positionally related genes showing a coordinated trend of differential expression. Graphical representations, including a circular plot of the analyzed chromosome, allow intuitive browsing of the results. The analysis procedure is based on our previously published R package PREDA. The release of this tool is timely and relevant for the scientific community, as WoPPER will fill an existing gap in prokaryotic gene expression data analysis and visualization tools. WoPPER is open to all users and can be reached at the following URL: https://WoPPER.ba.itb.cnr.it
UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs
The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/
InteractomeSeq: a web server for the identification and profiling of domains and epitopes from phage display and next generation sequencing data
High-Throughput Sequencing technologies are transforming many research fields, including the analysis of phage display libraries. The phage display technology coupled with deep sequencing was introduced more than a decade ago and holds the potential to circumvent the traditional laborious picking and testing of individual phage rescued clones. However, from a bioinformatics point of view, the analysis of this kind of data was always performed by adapting tools designed for other purposes, thus not considering the noise background typical of the 'interactome sequencing' approach and the heterogeneity of the data. InteractomeSeq is a web server allowing data analysis of protein domains ('domainome') or epitopes ('epitome') from either Eukaryotic or Prokaryotic genomic phage libraries generated and selected by following an Interactome sequencing approach. InteractomeSeq allows users to upload raw sequencing data and to obtain an accurate characterization of domainome/epitome profiles after setting the parameters required to tune the analysis. The release of this tool is relevant for the scientific and clinical community, because InteractomeSeq will fill an existing gap in the field of large-scale biomarkers profiling, reverse vaccinology, and structural/functional studies, thus contributing essential information for gene annotation or antigen identification. InteractomeSeq is freely available at https://InteractomeSeq.ba.itb.cnr.it/
BEAT: Bioinformatics Exon Array Tool to store, analyze and visualize Affymetrix GeneChip Human Exon Array data from disease experiments
Background: It is known from recent studies that more than 90% of human multi-exon genes are subject to Alternative Splicing (AS), a key molecular mechanism in which multiple transcripts may be generated from a single gene. It is widely recognized that a breakdown in AS mechanisms plays an important role in cellular differentiation and pathologies. Polymerase Chain Reactions, microarrays and sequencing technologies have been applied to the study of transcript diversity arising from alternative expression. Latest-generation Affymetrix GeneChip Human Exon 1.0 ST Arrays offer a more detailed view of the gene expression profile, providing information on AS patterns. The exon array technology, with more than five million data points, can detect approximately one million exons, and it allows analyses at both the gene and exon level. In this paper we describe BEAT, an integrated user-friendly bioinformatics framework to store, analyze and visualize exon array datasets. It combines a data warehouse approach with rigorous statistical methods for assessing the AS of genes involved in diseases. Meta statistics are proposed as a novel approach to explore the analysis results. BEAT is available at http://beat.ba.itb.cnr.it. Results: BEAT is a web tool which allows uploading and analyzing exon array datasets using standard statistical methods and an easy-to-use graphical web front-end. BEAT has been tested on a dataset with 173 samples and tuned using new datasets of exon array experiments from 28 colorectal cancer and 26 renal cell cancer samples produced at the Medical Genetics Unit of IRCCS Casa Sollievo della Sofferenza. To highlight all possible AS events, alternative names, accession IDs, Gene Ontology terms and biochemical pathway annotations are integrated with exon and gene level expression plots. The user can customize the results by choosing custom thresholds for the statistical parameters and exploiting the available clinical data of the samples for a multivariate AS analysis. Conclusions: Despite exon array chips being widely used for transcriptomics studies, there is a lack of analysis tools offering advanced statistical features and requiring no programming knowledge. BEAT provides a user-friendly platform for a comprehensive study of AS events in human diseases, displaying the analysis results with easily interpretable and interactive tables and graphics.
TRIM8 restores p53 tumour suppressor function by blunting N-MYC activity in chemo-resistant tumours
Background: TRIM8 plays a key role in controlling the p53 molecular switch that sustains the transcriptional activation of cell cycle arrest genes and response to chemotherapeutic drugs. The mechanisms that regulate TRIM8, especially in cancers like clear cell Renal Cell Carcinoma (ccRCC) and colorectal cancer (CRC) where it is expressed at low levels, are still unknown. However, recent studies suggest the potential involvement of some microRNAs belonging to miR-17-92 and its paralogous clusters, which could include TRIM8 in a more complex pathway. Methods: We used RCC and CRC cell models for in vitro experiments, and ccRCC patients and xenograft transplanted mice for in vivo assessments. To measure microRNA levels we performed RT-qPCR, while steady-state levels of TRIM8, p53, p21 and N-MYC were quantified at the protein level by Western blotting as well as at the transcript level by RT-qPCR. Luciferase reporter assays were performed to assess the interaction between TRIM8 and specific miRNAs, and the potential effects of this interaction on TRIM8 expression. Moreover, we treated our cell models with conventional chemotherapeutic drugs or tyrosine kinase inhibitors, and measured their response in terms of cell proliferation by MTT and colony suppression assays. Results: We showed that TRIM8 is a target of miR-17-5p and miR-106b-5p, whose expression is promoted by N-MYC, and that alterations of their levels affect cell proliferation, acting on TRIM8 transcript stability, as confirmed in ccRCC patients and cell lines. In addition, by reducing the levels of miR-17-5p/miR-106b-5p, we increased the chemo-sensitivity of RCC/CRC-derived cells to anti-tumour drugs used in the clinic. Intriguingly, this occurs, on the one hand, by recovering the p53 tumour suppressor activity in a TRIM8-dependent fashion and, on the other hand, by promoting the transcription of miR-34a, which turns off the oncogenic action of N-MYC. This ultimately leads to a reduction or block of cell proliferation, also observed in colon cancer xenografts overexpressing TRIM8. Conclusions: In this paper we provide evidence that TRIM8 and its regulators miR-17-5p and miR-106b-5p participate in a feedback loop controlling cell proliferation through the reciprocal modulation of p53, miR-34a and N-MYC. Our experiments indicate that this axis is pivotal in defining the drug responsiveness of cancers such as ccRCC and CRC.
A platform independent RNA-Seq protocol for the detection of transcriptome complexity
Background: Recent studies have demonstrated an unexpected complexity of transcription in eukaryotes. The majority of the genome is transcribed and only a small fraction of these transcripts is annotated as protein-coding genes and their splice variants. Indeed, most transcripts are the result of antisense, overlapping and non-coding RNA expression. In this context, one of the key aims of high-throughput transcriptome sequencing is the detection of all RNA species present in the cell, and the first crucial step for RNA-seq users is the choice of the strategy for cDNA library construction. The protocols developed so far require the use of the entire library for a single sequencing run with a specific platform.
Results: We set up a unique protocol to generate and amplify a strand-specific cDNA library representative of all RNA species that may be implemented with all major platforms currently available on the market (Roche 454, Illumina, ABI/SOLiD). Our method is reproducible, fast and easy to perform, and even allows starting from low-input total RNA. Furthermore, we provide a suitable bioinformatics tool for the analysis of the sequences produced following this protocol.
Conclusion: We tested the efficiency of our strategy, showing that our method is platform-independent, thus allowing the simultaneous analysis of the same sample with different NGS technologies, and providing an accurate quantitative and qualitative portrait of complex whole transcriptomes
