Optimal Computation of Overabundant Words
The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word w in a given sequence x can be used for classifying w as avoided or overabundant. The definitions used for the expectation and deviation of w in this statistical model were described and biologically justified by Brendel et al. (J Biomol Struct Dyn 1986). We have very recently introduced a time-optimal algorithm for computing all avoided words of a given sequence over an integer alphabet (Algorithms Mol Biol 2017). In this article, we extend this study by presenting an O(n)-time and O(n)-space algorithm for computing all overabundant words in a sequence x of length n over an integer alphabet. Our main result is based on a new non-trivial combinatorial property of the suffix tree T of x: the number of distinct factors of x whose longest infix is the label of an explicit node of T is no more than 3n-4. We further show that the presented algorithm is time-optimal by proving that O(n) is a tight upper bound for the number of overabundant words. Finally, we present experimental results, using both synthetic and real data, which justify the effectiveness and efficiency of our approach in practical terms.
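As a rough illustration of the classification described in this abstract, the following Python sketch scores a word from the observed frequencies of its longest proper prefix, longest proper suffix, and longest infix. It is a brute-force, quadratic-space illustration, not the O(n) suffix-tree algorithm the abstract refers to. The exact formulas are an assumption here: we take the expectation to be E(w) = f(prefix) · f(suffix) / f(infix) and the deviation to be (f(w) − E(w)) / max(√E(w), 1), and we treat a word of length greater than 2 as overabundant when its deviation is at least a threshold rho. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def factor_counts(x: str) -> Counter:
    """Count occurrences of every non-empty factor of x (brute force)."""
    counts = Counter()
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n + 1):
            counts[x[i:j]] += 1
    return counts


def deviation(w: str, counts: Counter) -> float:
    """Assumed deviation: (f(w) - E(w)) / max(sqrt(E(w)), 1)."""
    prefix, suffix, infix = w[:-1], w[1:], w[1:-1]
    f_infix = counts[infix]
    if f_infix == 0:
        # Assumption: an unseen infix gives expectation 0, hence deviation 0.
        return 0.0
    expected = counts[prefix] * counts[suffix] / f_infix
    return (counts[w] - expected) / max(sqrt(expected), 1.0)


def overabundant_words(x: str, rho: float) -> list[str]:
    """All factors of x longer than 2 whose assumed deviation is >= rho."""
    counts = factor_counts(x)
    return sorted(
        w for w in list(counts) if len(w) > 2 and deviation(w, counts) >= rho
    )
```

For example, calling `overabundant_words("abcabcbbbb", 1.0)` reports "abc" among its results, because "ab" and "bc" each occur twice while the infix "b" occurs six times, so the expected count of "abc" is well below its observed count of 2.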
Stroke genetics: prospects for personalized medicine.
Epidemiologic evidence supports a genetic predisposition to stroke. Recent advances, primarily using the genome-wide association study approach, are transforming what we know about the genetics of multifactorial stroke, and are identifying novel stroke genes. The current findings are consistent with different stroke subtypes having different genetic architecture. These discoveries may identify novel pathways involved in stroke pathogenesis, and suggest new treatment approaches. However, the already identified genetic variants explain only a small proportion of overall stroke risk, and therefore are not currently useful in predicting risk for the individual patient. Such risk prediction may become a reality as identification of a greater number of stroke risk variants that explain the majority of genetic risk proceeds, and perhaps when information on rare variants, identified by whole-genome sequencing, is also incorporated into risk algorithms. Pharmacogenomics may offer the potential for earlier implementation of 'personalized genetic' medicine. Genetic variants affecting clopidogrel and warfarin metabolism may identify non-responders and reduce side-effects, but these approaches have not yet been widely adopted in clinical practice.
CNEFinder: Finding conserved non-coding elements in genomes
Availability and implementation: Free software under the terms of the GNU GPL (https://github.com/lorrainea/CNEFinder).
Motivation: Conserved non-coding elements (CNEs) represent an enigmatic class of genomic elements which, despite being extremely conserved across evolution, do not encode proteins. Their functions are still largely unknown. Thus, there exists a need to systematically investigate their roles in genomes. Towards this direction, identifying sets of CNEs in a wide range of organisms is an important first step. Currently, there are no tools published in the literature for systematically identifying CNEs in genomes.
Results: We fill this gap by presenting CNEFinder, a tool for identifying CNEs between two given DNA sequences with user-defined criteria. The results presented here show the tool’s ability to identify CNEs accurately and efficiently. CNEFinder is based on a k-mer technique for computing maximal exact matches. The tool thus does not require or compute whole-genome alignments or indexes, such as the suffix array or the Burrows-Wheeler Transform (BWT), which makes it flexible to use on a wide scale.
This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/M50788X/1].
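To make the idea of a k-mer technique for maximal exact matches more concrete, here is a generic Python sketch of seed-and-extend matching between two sequences. It is not CNEFinder's implementation, which the abstract does not describe in detail. It only shows how exact matches can be found with a plain hash table of k-mers rather than a suffix array or BWT. The function name and the (start in a, start in b, length) output format are hypothetical.

```python
from collections import defaultdict


def maximal_exact_matches(a: str, b: str, k: int) -> list[tuple[int, int, int]]:
    """Return sorted (start_in_a, start_in_b, length) triples for all exact
    matches of length >= k that cannot be extended to the left or right."""
    # Index every k-mer of a by its starting positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    matches = set()
    for j in range(len(b) - k + 1):
        # .get avoids inserting empty lists for k-mers absent from a.
        for i in index.get(b[j:j + k], ()):
            # Extend the seed to the left while characters agree.
            s, t = i, j
            while s > 0 and t > 0 and a[s - 1] == b[t - 1]:
                s -= 1
                t -= 1
            # Extend the seed to the right while characters agree.
            e, f = i + k, j + k
            while e < len(a) and f < len(b) and a[e] == b[f]:
                e += 1
                f += 1
            # Seeds inside the same match collapse to one triple in the set.
            matches.add((s, t, e - s))
    return sorted(matches)
```

Because every seed is extended as far as possible in both directions, different seeds that lie inside the same match produce the same triple, and the set removes the duplicates. The table of k-mers grows with the length of `a`, which is the price paid for not building a full-text index.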
On overabundant words and their application to biological sequence analysis
The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word w in a given sequence x can be used for classifying w as avoided or overabundant. The definitions used for the expectation and deviation of w in this statistical model were described and biologically justified by Brendel et al. (J Biomol Struct Dyn 1986, [1]). We have very recently introduced a time-optimal algorithm for computing all avoided words of a given sequence over an integer alphabet (Algorithms Mol Biol 2017, [2]). In this article, we extend this study by presenting an O(n)-time and O(n)-space algorithm for computing all overabundant words in a sequence x of length n over an integer alphabet. Our main result is based on a new non-trivial combinatorial property of the suffix tree T of x: the number of distinct factors of x whose longest infix is the label of an explicit node of T is no more than 3n−4. We further show that the presented algorithm is time-optimal by proving that O(n) is a tight upper bound for the number of overabundant words. Finally, we present experimental results, using both synthetic and real data, which justify the effectiveness and efficiency of our approach in practical terms.
Optimal Computation of Avoided Words
The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of w, denoted by std(w), effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length k>2 is a ρ-avoided word in x if std(w)≤ρ, for a given threshold ρ<0. Notice that such a word may be completely absent from x. Hence computing all such words naïvely can be a very time-consuming procedure, in particular for large k. In this article, we propose an O(n)-time and O(n)-space algorithm to compute all ρ-avoided words of length k in a given sequence x of length n over a fixed-size alphabet. We also present a time-optimal O(σn)-time algorithm to compute all ρ-avoided words (of any length) in a sequence of length n over an integer alphabet of size σ. We provide a tight asymptotic upper bound for the number of ρ-avoided words over an integer alphabet and the expected length of the longest one. We make available an implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.
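The naïve procedure that this abstract calls time-consuming can be sketched in Python as follows: enumerate every word of length k over the alphabet of x, including words absent from x, and keep those with std(w) ≤ ρ. The abstract does not give the formula for std(w), so the one used here is an assumption: the expectation is f(prefix) · f(suffix) / f(infix) and std(w) = (f(w) − E(w)) / max(√E(w), 1), where f counts overlapping occurrences. The function names are hypothetical. The running time is exponential in k, which is the cost the linear-time algorithm avoids.

```python
from itertools import product
from math import sqrt


def occurrences(x: str, w: str) -> int:
    """Number of (possibly overlapping) occurrences of w in x."""
    return sum(1 for i in range(len(x) - len(w) + 1) if x.startswith(w, i))


def std(x: str, w: str) -> float:
    """Assumed deviation of w in x; see the hedged formula in the lead-in."""
    f_infix = occurrences(x, w[1:-1])
    if f_infix == 0:
        # Assumption: an unseen infix gives expectation 0, hence deviation 0.
        return 0.0
    expected = occurrences(x, w[:-1]) * occurrences(x, w[1:]) / f_infix
    return (occurrences(x, w) - expected) / max(sqrt(expected), 1.0)


def avoided_words_naive(x: str, k: int, rho: float) -> list[str]:
    """All words of length k over the alphabet of x with std(w) <= rho."""
    if k <= 2 or rho >= 0:
        raise ValueError("requires k > 2 and rho < 0")
    alphabet = sorted(set(x))
    # Enumerates |alphabet|**k candidates, absent words included.
    candidates = map("".join, product(alphabet, repeat=k))
    return [w for w in candidates if std(x, w) <= rho]
```

With `x = "abxabxabxbcxbcxbc"`, `k = 3` and `rho = -1.0`, the word "abc" is reported even though it never occurs in x: "ab" and "bc" each occur three times and "b" six times, so its expected count under the assumed model is 1.5 while its observed count is 0.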
Herbal substance, acteoside, alleviates intestinal mucositis in mice
This study investigated the role of acteoside in the amelioration of mucositis. C57BL/6 mice were gavaged daily with acteoside 600 μg for 5 d prior to induction of mucositis and throughout the experimental period. Mucositis was induced by methotrexate (MTX; 12.5 mg/kg; s.c.). Mice were culled on d 5 and d 11 after MTX. The duodenum, jejunum, and ileum were collected for myeloperoxidase (MPO) activity, metallothionein (MT) levels, and histology. Acteoside reduced histological severity scores by 75, 78, and 88% in the duodenum, jejunum, and ileum, respectively, compared to MTX-controls on d 5. Acteoside reduced crypt depth by 49, 51, and 33% and increased villus height by 19, 38, and 10% in the duodenum, jejunum, and ileum, respectively, compared to MTX-controls on d 5. Acteoside decreased MT by 50% compared to MTX-control mice on d 5. Acteoside decreased MPO by 60% and 30% in the duodenum and jejunum, respectively, compared to MTX-controls on d 5. Acteoside alleviated MTX-induced small intestinal mucositis possibly by preventing inflammation.
Daniel Reinke, Stamatiki Kritas, Panagiotis Polychronopoulos, Alexios L. Skaltsounis, Nektarios Aligiannis, and Cuong D. Tra
Unravelling the Genetics of Ischaemic Stroke
Hugh Markus discusses genetic factors in stroke risk, and emphasizes the importance of large sample studies and rigorous replication of results in genetic stroke research.
The effect of stimulation technique on sympathetic skin responses in healthy subjects
The aim of this study was to collect normative data for sympathetic skin responses (SSR) elicited by electrical stimulus of the ipsilateral and contralateral peripheral nerves, and by magnetic stimulus of the cervical cord. SSRs were measured at the mid-palm of both hands following electrical stimulation of the left median nerve at the wrist and magnetic stimulation at the neck in 40 healthy adult volunteers (mean age 52.2 ± 12.2 years, 19 males). The onset latency, peak latency, amplitude and area were estimated in “P” type responses (i.e., waveforms with a larger positive, compared to negative, component). SSR onset and peak latency were prolonged when the electrical stimulus was applied at the contralateral side (i.e., the SSR recorded in the right palm; P < 0.001). The onset latency was similar on both sides during cervical magnetic stimulation. However, peak latency was shorter on the left side (P < 0.03). Comparison of electrical and magnetic stimulation revealed that both the onset and peak latency were shorter with magnetic stimulation (P < 0.001). The latency of an SSR varies depending on what type of stimulation is used and where the stimulus is applied. Electrically generated SSRs have a longer delay and the delay is prolonged at the contralateral side. These factors should be taken into account when interpreting SSR data.
Wnt/β-catenin controls follistatin signalling to regulate satellite cell myogenic potential
BACKGROUND: Adult skeletal muscle regeneration is a highly orchestrated process involving the activation and proliferation of satellite cells, the adult skeletal muscle stem cells. Activated satellite cells generate a transient amplifying progenitor pool of myoblasts that commit to differentiation and fuse into multinucleated myotubes. During regeneration, canonical Wnt signalling is activated and has been implicated in regulating myogenic lineage progression and terminal differentiation. METHODS: Here, we have undertaken a gene expression analysis of committed satellite cell-derived myoblasts to examine their ability to respond to canonical Wnt/β-catenin signalling. RESULTS: We found that activation of canonical Wnt signalling induces follistatin expression in myoblasts and promotes myoblast fusion in a follistatin-dependent manner. In growth conditions, canonical Wnt/β-catenin signalling primes myoblasts for myogenic differentiation by stimulating myogenin and follistatin expression. We further found that myogenin binds elements in the follistatin promoter and that follistatin thus acts downstream of myogenin during differentiation. Finally, ectopic activation of canonical Wnt signalling in vivo promoted premature differentiation during muscle regeneration following acute injury. CONCLUSIONS: Together, these data reveal a novel mechanism by which myogenin mediates the canonical Wnt/β-catenin-dependent activation of follistatin and induction of the myogenic differentiation process. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13395-015-0038-6) contains supplementary material, which is available to authorized users.
Assessing the digenic model in rare disorders using population sequencing data
An important fraction of patients with rare disorders remain without a clear genetic diagnosis, even after whole-exome or whole-genome sequencing, which makes it difficult to provide adequate treatment and genetic counseling. The analysis of genomic data in rare disorders mostly considers the presence of single gene variants in coding regions that follow a concrete monogenic mode of inheritance. A digenic inheritance, with variants in two functionally related genes in the same individual, is a plausible alternative that might explain the genetic basis of the disease in some cases. In this case, digenic disease combinations should be absent or underrepresented in healthy individuals. We develop a framework to evaluate the significance of digenic combinations and test its statistical power in different scenarios. We suggest that this approach will be relevant with the advent of new sequencing efforts including hundreds of thousands of samples.
