1,132 research outputs found
Low-Bandwidth and Non-Compute Intensive Remote Identification of Microbes from Raw Sequencing Reads
Cheap high-throughput DNA sequencing may soon become routine not only for
human genomes but also for practically anything requiring the identification of
living organisms from their DNA: tracking of infectious agents, control of food
products, bioreactors, or environmental samples.
We propose a novel general approach to the analysis of sequencing data in
which the reference genome does not have to be specified. Using a distributed
architecture we are able to query a remote server for hints about what the
reference might be, transferring a relatively small amount of data, and the
hints can be used for more computationally-demanding work.
Our system consists of a server with known reference DNA indexed, and a
client with raw sequencing reads. The client sends a sample of unidentified
reads, and in return receives a list of matching references known to the
server. Sequences for the references can be retrieved and used for exhaustive
computation on the reads, such as alignment.
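The exchange described above can be illustrated with a small in-memory sketch. It is not the authors' implementation: the abstract does not say how reads are matched against the index, so exact k-mer lookup is assumed here, and the names `ToyServer`, `sample_reads`, and the value of `K` are hypothetical.

```python
import random
from collections import Counter, defaultdict

K = 8  # k-mer length; an illustrative choice, not taken from the paper


def kmers(seq, k=K):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class ToyServer:
    """Stand-in for the remote server: holds an index of known references."""

    def __init__(self, references):
        # Map each k-mer to the set of reference names that contain it.
        self.index = defaultdict(set)
        for name, seq in references.items():
            for km in kmers(seq):
                self.index[km].add(name)

    def query(self, reads):
        """Return (reference name, number of matching reads), best first."""
        hits = Counter()
        for read in reads:
            matched = set()
            for km in kmers(read):
                matched |= self.index.get(km, set())
            hits.update(matched)  # each read counts once per reference
        return hits.most_common()


def sample_reads(reads, n, seed=0):
    """Client side: pick a small random sample of reads to send."""
    rng = random.Random(seed)
    return rng.sample(reads, min(n, len(reads)))
```

Only the sampled reads travel to the server and only a short list of reference names travels back; the client can then fetch those references and run alignment locally.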
To demonstrate this approach we have implemented a web server, indexing tens
of thousands of publicly available genomes and genomic regions from various
organisms and returning lists of matching hits from query sequencing reads. We
have also implemented two clients, one of them running in a web browser, in
order to demonstrate that gigabytes of raw sequencing reads of unknown origin
could be identified without the need to transfer a very large volume of data,
and on modestly powered computing devices.
Web access is available at http://tapir.cbs.dtu.dk. The source code for a
Python command-line client, a server, and supplementary data is available at
http://bit.ly/1aURxkc
Insights into enterotoxigenic Escherichia coli diversity in Bangladesh utilizing genomic epidemiology
Detection of a novel gammaherpesvirus (genus Rhadinovirus) in wild muntjac deer in Northern Ireland
This study represents the initial part of an investigation into the potential for non-native, wild, free-living muntjac deer (Muntiacus reevesi) to carry viruses that could be a threat to livestock. A degenerate PCR assay was used to screen a range of tissues from muntjac deer culled in Northern Ireland for the presence of herpesviral nucleic acids. This was followed by sequencing of PCR amplicons and phylogenetic analysis. We report the detection of a novel gammaherpesvirus most closely related to a type 2 ruminant rhadinovirus from mule deer. It remains to be determined whether this new virus is pathogenic to deer or presents a risk to food security through the susceptibility of domestic livestock.
A transcriptomic snapshot of early molecular communication between Pasteuria penetrans and Meloidogyne incognita
© The Author(s). 2018. Background: Southern root-knot nematode Meloidogyne incognita (Kofoid and White, 1919), Chitwood, 1949 is a key pest of agricultural crops. Pasteuria penetrans is a hyperparasitic bacterium capable of suppressing nematode reproduction, and the pair represents a typical coevolved pathogen-hyperparasite system. Attachment of Pasteuria endospores to the cuticle of second-stage nematode juveniles is the first and pivotal step in the bacterial infection. RNA-Seq was used to understand the early transcriptional response of the root-knot nematode at 8 h post Pasteuria endospore attachment. Results: A total of 52,485 transcripts were assembled from the high quality (HQ) reads, of which 582 were differentially expressed in the Pasteuria endospore-encumbered J2s: 229 were up-regulated and 353 were down-regulated. Pasteuria infection caused a suppression of the protein synthesis machinery of the nematode. Several of the differentially expressed transcripts were putatively involved in nematode innate immunity, signaling, stress responses, the endospore attachment process and post-attachment behavioral modification of the juveniles. The expression profiles of fifteen selected transcripts were validated by qRT-PCR. RNAi-based silencing of transcripts coding for fructose bisphosphate aldolase and glucosyl transferase caused a reduction in endospore attachment compared to the controls, whereas silencing of aspartic protease and ubiquitin coding transcripts resulted in a higher incidence of endospore attachment on the nematode cuticle. Conclusions: Here we provide evidence of an early transcriptional response by the nematode upon infection by Pasteuria prior to root invasion. We found that adhesion of Pasteuria endospores to the cuticle induced a down-regulated protein response in the nematode. In addition, we show that fructose bisphosphate aldolase, glucosyl transferase, aspartic protease and ubiquitin coding transcripts are involved in modulating the endospore attachment on the nematode cuticle. Our results add new and significant information to the existing knowledge on early molecular interaction between M. incognita and P. penetrans.
AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees
A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have applied it to four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php
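To make the idea of diversity-maximizing sampling concrete, here is a minimal sketch. It is not the AST algorithm, which the abstract does not describe; it assumes each sequence carries a lineage tuple (a hypothetical input format) and uses a simple greedy farthest-first rule, where `sample_diverse` and `shared_depth` are illustrative names.

```python
def shared_depth(a, b):
    """Number of leading taxonomic ranks two lineages have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def sample_diverse(lineages, n):
    """Greedily pick up to n sequence ids with taxonomically distant lineages.

    lineages maps a sequence id to a tuple of ranks, broadest rank first.
    """
    ids = sorted(lineages)  # fixed order keeps the result deterministic
    if not ids or n <= 0:
        return []
    chosen = [ids[0]]
    while len(chosen) < min(n, len(ids)):
        remaining = [i for i in ids if i not in chosen]
        # A candidate's closeness is its deepest shared lineage with any
        # sequence already chosen; take the candidate that is least close.
        best = min(
            remaining,
            key=lambda i: max(
                shared_depth(lineages[i], lineages[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen
```

With a pool dominated by one well-studied clade, this rule picks one representative from it and then moves to other clades before returning for a second, which is the kind of bias correction the abstract motivates.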
Missed, not missing: Phylogenomic evidence for the existence of Avian FoxP3
The Forkhead box transcription factor FoxP3 is pivotal to the development and function of regulatory T cells (Tregs), which make a major contribution to peripheral tolerance. FoxP3 is believed to perform a regulatory role in all the vertebrate species in which it has been detected. The prevailing view is that FoxP3 is absent in birds and that avian Tregs rely on alternative developmental and suppressive pathways. Prompted by the automated annotation of foxp3 in the ground tit (Parus humilis) genome, we have questioned this assumption. Our analysis of all available avian genomes has revealed that the foxp3 locus is missing, incomplete or of poor quality in the relevant genomic assemblies for nearly all avian species. Nevertheless, in two species, the peregrine falcon (Falco peregrinus) and the saker falcon (F. cherrug), there is compelling evidence for the existence of exons showing synteny with foxp3 in the ground tit. A broader phylogenomic analysis has shown that FoxP3 sequences from these three species are similar to sequences from crocodilians, the closest living relatives of birds. In both birds and crocodilians, we have also identified a highly proline-enriched region at the N terminus of FoxP3, a region previously identified only in mammals.
To hit or not to hit, that is the question: genome-wide structure-based druggability predictions for Pseudomonas aeruginosa proteins
Pseudomonas aeruginosa is a Gram-negative bacterium known to cause opportunistic infections in immune-compromised or immunosuppressed individuals that often prove fatal. New drugs to combat this organism are therefore sought after. To this end, we subjected the gene products of predicted perturbative genes to structure-based druggability predictions using DrugPred. Making this approach suitable for large-scale predictions required the introduction of new methods for calculation of descriptors, development of a workflow to identify suitable pockets in homologous proteins and establishment of criteria to obtain valid druggability predictions based on homologs. We were able to identify 29 perturbative proteins of P. aeruginosa that may contain druggable pockets, including some with no inhibitors, or no drug-like inhibitors, deposited in ChEMBL. These proteins form promising novel targets for drug discovery against P. aeruginosa.
The Barley Genome Sequence Assembly Reveals Three Additional Members of the CslF (1,3;1,4)-β-Glucan Synthase Gene Family
An important component of barley cell walls, particularly in the endosperm, is (1,3;1,4)-β-glucan, a polymer that has proven health benefits in humans and that influences processability in the brewing industry. Genes of the cellulose synthase-like (Csl) F gene family have been shown to be involved in (1,3;1,4)-β-glucan synthesis but many aspects of the biosynthesis are still unclear. Examination of the sequence assembly of the barley genome has revealed the presence of an additional three HvCslF genes (HvCslF11, HvCslF12 and HvCslF13) which may be involved in (1,3;1,4)-β-glucan synthesis. Transcripts of HvCslF11 and HvCslF12 were found in roots and young leaves, respectively. Transient expression of these genes in Nicotiana benthamiana resulted in phenotypic changes in the infiltrated leaves, although no authentic (1,3;1,4)-β-glucan was detected. Comparisons of the CslF gene families in cereals revealed evidence of intergenic recombination, gene duplications and translocation events. This significant divergence within the gene family might be related to multiple functions of (1,3;1,4)-β-glucans in the Poaceae. Emerging genomic and global expression data for barley and other cereals are a powerful resource for characterising the evolution and dynamics of complete gene families. In the case of the CslF gene family, the results will contribute to a more thorough understanding of carbohydrate metabolism in grass cell walls.
A membrane-inserted structural model of the yeast mitofusin Fzo1
Mitofusins are large transmembrane GTPases of the dynamin-related protein family, and are required for the tethering and fusion of mitochondrial outer membranes. Their full-length structures remain unknown, which is a limiting factor in the study of outer membrane fusion. We investigated the structure and dynamics of the yeast mitofusin Fzo1 through a hybrid computational and experimental approach, combining molecular modelling and all-atom molecular dynamics simulations in a lipid bilayer with site-directed mutagenesis and in vivo functional assays. The predicted architecture of Fzo1 improves upon the current domain annotation, with a precise description of the helical spans linked by flexible hinges, which are likely of functional significance. In vivo site-directed mutagenesis validates salient aspects of this model, notably the long-distance contacts and the residues participating in hinges. GDP is predicted to interact with Fzo1 through the G1 and G4 motifs of the GTPase domain. The model reveals structural determinants critical for protein function, including regions that may be involved in GTPase domain-dependent rearrangements.
Arabinose and protocatechuate catabolism genes are important for growth of Rhizobium leguminosarum biovar viciae in the pea rhizosphere
Background and aims: To form nitrogen-fixing nodules on pea roots, Rhizobium leguminosarum biovar viciae must be competitive in the rhizosphere. Our aim was to identify genes important for rhizosphere fitness.
Methods: Signature-tagged mutants were screened using microarrays to identify mutants reduced for growth in pea rhizospheres. Candidate mutants were assessed relative to controls for growth in minimal medium, growth in pea rhizospheres and for infection of peas in mixed inoculants. Mutated genes were identified by DNA sequencing and confirmed by transduction.
Results: Of 5508 signature-tagged mutants, microarrays implicated 50 as having decreased rhizosphere fitness. Growth tests identified six mutants with rhizosphere-specific phenotypes. The mutation in one of the genes (araE) was in an arabinose catabolism operon and blocked growth on arabinose. The mutation in another gene (pcaM), encoding a predicted solute binding protein for protocatechuate and hydroxybenzoate uptake, decreased growth on protocatechuate. Both mutants were decreased for nodule infection competitiveness with mixed inoculants, but nodulated peas normally when inoculated alone. Other mutants with similar phenotypes had mutations predicted to affect secondary metabolism.
Conclusions: Catabolism of arabinose and protocatechuate in the pea rhizosphere is important for competitiveness of R.l. viciae. Other genes predicted to be involved in secondary metabolism are also important.
