346 research outputs found
Protein engineering expands the effector recognition profile of a rice NLR immune receptor
Plant nucleotide binding, leucine-rich repeat (NLR) receptors detect pathogen effectors and initiate an immune response. Since their discovery, NLRs have been the focus of protein engineering to improve disease resistance. However, this approach has proven challenging, in part due to their narrow response specificity. Previously, we revealed the structural basis of pathogen recognition by the integrated heavy metal associated (HMA) domain of the rice NLR Pikp (Maqbool et al., 2015). Here, we used structure-guided engineering to expand the response profile of Pikp to variants of the rice blast pathogen effector AVR-Pik. A mutation located within an effector-binding interface of the integrated Pikp-HMA domain increased the binding affinity for AVR-Pik variants in vitro and in vivo. This translates to an expanded cell-death response to AVR-Pik variants previously unrecognized by Pikp in planta. The structures of the engineered Pikp-HMA in complex with AVR-Pik variants revealed the mechanism of expanded recognition. These results provide a proof-of-concept that protein engineering can improve the utility of plant NLR receptors where direct interaction between effectors and NLRs is established, particularly where this interaction occurs via integrated domains
Using 2k + 2 bubble searches to find single nucleotide polymorphisms in k-mer graphs
Motivation: Single nucleotide polymorphism (SNP) discovery is an important preliminary for understanding genetic variation. With current sequencing methods, we can sample genomes comprehensively. SNPs are found by aligning sequence reads against longer assembled references. De Bruijn graphs are efficient data structures that can deal with the vast amount of data from modern technologies. Recent work has shown that the topology of these graphs captures enough information to allow the detection and characterization of genetic variants, offering an alternative to alignment-based methods. Such methods rely on depth-first walks of the graph to identify closing bifurcations. These methods are conservative or generate many false-positive results, particularly when traversing highly inter-connected (complex) regions of the graph or in regions of very high coverage. Results: We devised an algorithm that calls SNPs in converted De Bruijn graphs by enumerating 2k + 2 cycles. We evaluated the accuracy of predicted SNPs by comparison with SNP lists from alignment-based methods. We tested accuracy of the SNP calling using sequence data from 16 ecotypes of Arabidopsis thaliana and found that accuracy was high. We found that SNP calling was even across the genome and genomic feature types. Using sequence-based attributes of the graph to train a decision tree allowed us to increase accuracy of SNP calls further. Together these results indicate that our algorithm is capable of finding SNPs accurately in complex sub-graphs and potentially comprehensively from whole genome graphs
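The "2k + 2" in the title can be illustrated with a toy example. This is a minimal sketch, not the authors' algorithm (which enumerates cycles in a converted graph): it assumes a graph whose nodes are k-mers and whose edges join consecutive k-mers of each input sequence, and it assumes exactly one isolated SNP. The function names and the two example sequences are hypothetical. An isolated SNP changes the k k-mers that cover it on each allele, so the bubble consists of k nodes per branch plus the shared fork and merge nodes.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the k-mers of seq in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_graph(seqs, k):
    """Directed k-mer graph: an edge joins consecutive k-mers of a sequence."""
    succ = defaultdict(set)
    for seq in seqs:
        ks = kmers(seq, k)
        for a, b in zip(ks, ks[1:]):
            succ[a].add(b)
    return dict(succ)


def bubble_node_count(succ):
    """Count nodes in the single bubble of the graph, or None if not found.

    Assumes exactly one fork node (two successors) whose branches are
    simple paths that rejoin at one merge node (two predecessors).
    """
    pred = defaultdict(set)
    for a, outs in succ.items():
        for b in outs:
            pred[b].add(a)
    forks = [n for n, outs in succ.items() if len(outs) == 2]
    if len(forks) != 1:
        return None
    count = 2  # the fork node and the merge node
    merges = set()
    for node in succ[forks[0]]:
        # Walk one branch until a node with two predecessors is reached.
        while len(pred.get(node, ())) < 2:
            count += 1
            outs = succ.get(node, set())
            if len(outs) != 1:
                return None  # dead end or nested branching
            node = next(iter(outs))
        merges.add(node)
    return count if len(merges) == 1 else None


if __name__ == "__main__":
    k = 4
    ref = "GATTACAGTCC"  # hypothetical reference allele
    alt = "GATTAGAGTCC"  # same sequence with one substitution (C -> G)
    print(bubble_node_count(build_graph([ref, alt], k)), 2 * k + 2)
```

With real read data, repeats and sequencing errors create far more tangled subgraphs than this toy case, which is the difficulty the abstract describes.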
Improved K-mer Based Prediction of Protein-Protein Interactions With Chaos Game Representation, Deep Learning and Reduced Representation Bias
Protein-protein interactions drive many biological processes, including the
detection of phytopathogens by plants' R-Proteins and cell surface receptors.
Many machine learning studies have attempted to predict protein-protein
interactions but performance is highly dependent on training data; models have
been shown to accurately predict interactions when the proteins involved are
included in the training data, but achieve consistently poorer results when
applied to previously unseen proteins. In addition, models that are trained
using proteins that take part in multiple interactions can suffer from
representation bias, where predictions are driven not by learned biological
features but by learning of the structure of the interaction dataset.
We present a method for extracting unique pairs from an interaction dataset,
generating non-redundant paired data for unbiased machine learning. After
applying the method to datasets containing _Arabidopsis thaliana_ and pathogen
effector interactions, we developed a convolutional neural network model capable
of learning and predicting interactions from Chaos Game Representations of
proteins' coding genes
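As an illustration of what a Chaos Game Representation of a nucleotide sequence looks like, here is a minimal sketch. It is not the authors' pipeline. The corner assignment (A, C, G, T at the corners of the unit square), the decision to skip ambiguous bases, the grid size and the function names are all assumptions made for this example. Each base moves the current point halfway towards its corner, and the visited points are binned into a square grid that could serve as image-like input to a convolutional network.

```python
# Assumed corner layout; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos-game trajectory of a DNA sequence in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # assumption: ambiguous bases such as N are skipped
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway to the corner
        points.append((x, y))
    return points


def cgr_grid(seq, size=8):
    """Bin the trajectory into a size x size grid of visit counts."""
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_grid("ATGGCGTACGTTAGC", size=4):
        print(row)
```

Because each step halves the distance to a corner, the cell a point lands in depends on the most recent bases, so the grid acts as a spatial summary of k-mer content.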
Crowdsourcing genomic analyses of ash and ash dieback – power to the people
Ash dieback is a devastating fungal disease of ash trees that has swept across Europe and recently reached the UK. This emergent pathogen has received little study in the past and its effect threatens to overwhelm the ash population. In response to this we have produced some initial genomics datasets and taken the unusual step of releasing them to the scientific community for analysis without first performing our own. In this manner we hope to ‘crowdsource’ analyses and bring the expertise of the community to bear on this problem as quickly as possible. Our data has been released through our website at oadb.tsl.ac.uk and a public GitHub repository
Accurate plant pathogen effector protein classification ab initio with deepredeff: an ensemble of convolutional neural networks
Background: Plant pathogens cause billions of dollars of crop loss every year and are a major threat to global food security. Effector proteins are the tools such pathogens use to infect the cell; predicting effectors de novo from sequence is difficult because of the heterogeneity of the sequences. We hypothesised that deep learning classifiers based on Convolutional Neural Networks would be able to identify effectors and deliver new insights. Results: We created a training set of manually curated effector sequences from PHI-Base and used these to train a range of model architectures for classifying bacterial, fungal and oomycete sequences. The best performing classifiers had accuracies from 93 to 84%. The models were tested against popular effector detection software on our own test data and data provided with those models. We observed better performance from our models. Specifically, our models showed greater accuracy, a lower tendency to call false positives on a secreted protein negative test set, and greater generalisability. We used GRAD-CAM activation map analysis to identify the sequences that activated our CNN-LSTM models and found short but distinct N-terminal regions in each taxon that were indicative of effector sequences. No motifs could be observed in these regions, but an analysis of amino acid types indicated differing patterns of enrichment and depletion that varied between taxa. Conclusions: Small training sets can be used effectively to train highly accurate and sensitive deep learning models without the need for the operator to know anything other than sequence and without arbitrary decisions about which sequence features or physico-chemical properties are important. Biological insight into subsequences important for classification can be gained by examining the activations in the model
Andragogy in the 21st century: Applying the Assumptions of Adult Learning Online
Regardless of whether their motivation is intrinsic or extrinsic, adults undertake a course of learning with much more sophisticated needs and expectations than younger learners, and this will strongly influence their persistence. The six assumptions of Knowles’ Andragogical Model provide insight into this psychomotivational cocktail that we will use to make practical recommendations for instructors about how to fully activate adults’ imperative to articulate and accomplish their online educational goals, an essential variable toward their success. Given an attrition rate of up to 80% for some online learning contexts, it is vital that the educational approach of instructional design for online learning aligns with the learning objectives that correspond to learners’ real-world needs. If educational technology is to live up to the promise of enhancing online learning outcomes, a different paradigm for instructional design and delivery of content is needed. This paper will provide guidelines and techniques for incorporating adult learning principles into the structure, delivery, and mentoring/administration of online courses of study
Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects
Funding: R.A.F. was funded by the Natural Environment Research Council (NERC). D.A.H. and M.C.F. were supported by the Wellcome Trust. No additional external funding received for this study. Peer reviewed
Morphological and Physiological Changes of Brassica oleracea Acephala Group Seedlings as Affected by Ion and Salt Stress
The aim of this study was to determine the effect of salt stress on morphological and physiological changes of Brassica oleracea acephala group seedlings. Seedlings of kale cultivar Red Russian (RR) and collard Croatian population Konavle 2 (K2) were grown in a floating hydroponic system in Tifton, Georgia, USA. Seedlings were treated with seven different nutrient solutions (NS). The control NS (EC 2 dS m-1) was concentrated to achieve EC 4, 6 or 8 dS m-1. Three additional salt treatments involved adding NaCl solution to the control NS to obtain EC 4 NaCl (2 NS + 2 NaCl), EC 6 NaCl (2 NS + 4 NaCl) and EC 8 NaCl (2 NS + 6 NaCl) dS m-1. Leaf gas exchange parameters decreased with increased EC. Seedlings treated with EC 6 NaCl and 8 NaCl dS m-1 had the lowest leaf relative water content (less than 59%). Seedlings treated with 2 dS m-1 had the greatest leaf area (LA; 187 cm2). Cultivar RR had greater LA (131 cm2) than population K2 (84 cm2). Increased percentage of shoot (14.1%) and root (10.4%) dry weight (DW) was recorded in seedlings treated with EC 8 dS m-1, c. Population K2 had higher shoot (10.9%) and root (10.4%) DW percentage compared with cv. RR. In conclusion, the nutrient solution of EC 4 NaCl had a negative effect on morphological characteristics compared with the same solution without NaCl. Increased concentrations of NS significantly affected the leaf thickness (SLA) of B. oleracea acephala group seedlings. This can be used as a production tool for seedling hardening
