How well do RNA-Seq differential gene expression tools perform in a complex eukaryote? A case study in Arabidopsis thaliana
RNA-seq experiments are usually carried out in three or fewer replicates. In order to work well with so few samples, differential gene expression (DGE) tools typically assume the form of the underlying gene expression distribution. In this paper, the statistical properties of gene expression from RNA-seq are investigated in the complex eukaryote, Arabidopsis thaliana, extending and generalizing the results of previous work in the simple eukaryote Saccharomyces cerevisiae. Results: We show that, consistent with the results in S. cerevisiae, more gene expression measurements in A. thaliana are consistent with being drawn from an underlying negative binomial distribution than either a log-normal distribution or a normal distribution, and that the size and complexity of the A. thaliana transcriptome does not influence the false positive rate performance of nine widely used DGE tools tested here. We therefore recommend the use of DGE tools that are based on the negative binomial distribution. Availability and implementation: The raw data for the 17 WT Arabidopsis thaliana datasets are available from the European Nucleotide Archive (E-MTAB-5446). The processed and aligned data can be visualized in context using IGB (Freese et al., 2016), or downloaded directly, using our publicly available IGB quickload server at https://compbio.lifesci.dundee.ac.uk/arabidopsisQuickload/public-quickload/ under 'RNAseq>Froussios2019'. All scripts and commands are available from GitHub at https://github.com/bartongroup/KF-arabidopsis-GRNA. Supplementary information: Supplementary data are available at Bioinformatics online.
Detection and Mitigation of Spurious Antisense RNA-seq Reads with RoSA
Motivation: Antisense transcription is known to have a range of impacts on sense gene expression, including (but not limited to) impeding transcription initiation, disrupting post-transcriptional processes, and enhancing, slowing, or even preventing transcription of the sense gene. Strand-specific RNA-Seq protocols preserve the strand information of the original RNA in the data, and so can be used to identify where antisense transcription may be implicated in regulating gene expression. However, our analysis of 199 strand-specific RNA-Seq experiments reveals that spurious antisense reads are often present in these datasets at levels greater than 1% of sense gene expression levels. Furthermore, these levels can vary substantially even between replicates in the same experiment, potentially disrupting any downstream analysis if the incorrectly assigned antisense counts dominate the set of genes with high antisense transcription levels. Currently, no tools exist to detect or correct for this spurious antisense signal. Results: Our tool, RoSA (Removal of Spurious Antisense), detects the presence of high levels of spurious antisense read alignments in strand-specific RNA-Seq datasets. It uses incorrectly spliced reads on the antisense strand and/or ERCC spike-ins (if present in the data) to calculate both global and gene-specific antisense correction factors. We demonstrate the utility of our tool to filter out spurious antisense transcript counts in an Arabidopsis thaliana RNA-Seq experiment.
Relative Abundance of Transcripts (RATs): Identifying differential isoform abundance from RNA-seq [version 1; referees: 1 approved, 2 approved with reservations]
The biological importance of changes in RNA expression is reflected by the wide variety of tools available to characterise these changes from RNA-seq data. Several tools exist for detecting differential transcript isoform usage (DTU) from aligned or assembled RNA-seq data, but few exist for DTU detection from alignment-free RNA-seq quantifications. We present RATs, an R package that identifies DTU transcriptome-wide directly from transcript abundance estimates. RATs is unique in applying bootstrapping to estimate the reliability of detected DTU events and shows good performance at all replication levels (median false positive fraction < 0.05). We compare RATs to two existing DTU tools, DRIM-Seq and SUPPA2, using two publicly available simulated RNA-seq datasets and a published human RNA-seq dataset, in which 248 genes have been previously identified as displaying significant DTU. RATs with default threshold values on the simulated human data has a sensitivity of 0.55, a Matthews correlation coefficient of 0.71 and a false discovery rate (FDR) of 0.04, outperforming both other tools. Applying the same thresholds for SUPPA2 results in a higher sensitivity (0.61) but poorer FDR performance (0.33). RATs and DRIM-Seq use different methods for measuring DTU effect sizes, complicating the comparison of results between these tools; however, for a likelihood-ratio threshold of 30, DRIM-Seq has similar FDR performance to RATs (0.06), but worse sensitivity (0.47). These differences persist for the simulated Drosophila dataset. On the published human RNA-seq dataset the greatest agreement between the tools tested is 53%, observed between RATs and SUPPA2. The bootstrapping quality filter in RATs is responsible for removing the majority of DTU events called by SUPPA2 that are not reported by RATs. All methods, including the previously published qRT-PCR of three of the 248 detected DTU events, were found to be sensitive to annotation differences between Ensembl v60 and v87.
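The abstract above compares tools by sensitivity, false discovery rate (FDR) and Matthews correlation coefficient (MCC). As a reminder of how these three quantities relate to confusion-matrix counts, here is a minimal Python sketch using the standard textbook definitions. The function name and the example counts are hypothetical and are not taken from the paper or from the RATs package.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, FDR, MCC) from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives. Undefined ratios are returned as NaN.
    """
    nan = float("nan")
    # Sensitivity: fraction of truly positive items that were called.
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    # FDR: fraction of calls that are false positives.
    fdr = fp / (tp + fp) if (tp + fp) else nan
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else nan
    return sensitivity, fdr, mcc


# Hypothetical counts for illustration only (not results from the paper).
sens, fdr, mcc = classification_metrics(tp=55, fp=5, tn=895, fn=45)
print(f"sensitivity={sens:.2f} FDR={fdr:.2f} MCC={mcc:.2f}")
```

Because MCC uses all four counts, it can stay informative when true negatives greatly outnumber true positives, which may be one reason to report it alongside sensitivity and FDR.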
Regulation of somatic hypermutation by higher-order chromatin structure
The generation of protective antibodies by somatic hypermutation (SHM) is essential for antibody maturation and adaptive immunity. SHM involves co-transcriptional mutagenesis of immunoglobulin variable (V) regions regulated by enhancers located hundreds of kilobases away. How 3D chromatin topology affects SHM is poorly understood. Here, we measure higher-order interactions on single alleles of the human immunoglobulin heavy-chain locus (IGH) using Tri-C. We find that SHM is underpinned by a multiway hub wherein the V region is proximal to all enhancers. Cohesin-mediated loop extrusion is dispensable for IGH transcription and hub architecture. Transcription and mutagenesis of IGH switch regions, which are necessary for antibody class-switch recombination, create new chromatin loops that can form without cohesin. However, these additional loops do not compromise hub integrity, V region transcription, or SHM. Thus, antibody maturation occurs within a multiway hub accommodating several gene-enhancer loops in which transcription and mutagenesis of different segments occur non-competitively.
ChemInform Abstract: Preparation of Diphenylmethyl Esters and Ethers of Unprotected Amino Acids and β-Hydroxy-α-amino Acids.
ChemInform Abstract: A New Method for the Protection of the Carboxy Groups in α-Amino Acids: 9-Fluorylidene Esters.
O-diphenylmethylation of alcohols and carboxylic acids using diphenylmethyl diphenyl phosphate as alkylating agent
Diphenylmethyl diphenyl phosphate reacts quickly under mild conditions with various alcohols and carboxylic acids to give diphenylmethyl ethers and esters, respectively, while hydroxy acids can be selectively alkylated at the alcohol site. © 1984
Facile synthesis of 1-adamantyl esters of L-α-amino acids, a new class of carboxy protected derivatives
1-Adamantyl esters of several N-unprotected L-α-amino acids were directly prepared in good optical purity and yield by reaction of the corresponding amino acid 4-toluenesulfonate salts with 1-adamantanol (AdOH) and dimethyl sulfite in boiling toluene. The fully protected tripeptide Boc-Leu-Ala-Val-OAd, prepared from TsOH·H-Val-OAd (entry 2b), was amino deprotected to H-Leu-Ala-Val-OAd by the action of 4 N HCl in dioxane for 25 minutes at 20°C, while the latter was carboxy deprotected to the free peptide by the action of trifluoroacetic acid for 60 minutes at 20°C. The 1-adamantyl ether of threonine (5) was also prepared, and the 1-adamantyl moiety was completely cleaved from Troc-Thr(OAd)-OH by the action of trifluoroacetic acid for 30 minutes at 20°C.
