Which gene did you mean?
Computational biology needs computer-readable information records. Increasingly, meta-analysed and pre-digested information is being used in the follow-up of high-throughput experiments and other investigations that yield massive data sets. Semantic enrichment of plain text is crucial for computer-aided analysis. In general, people think of semantic tagging as just another form of text mining, a term that has quite a negative connotation for some biologists who have been disappointed by classical text-mining approaches. Efforts so far have tried to develop tools and technologies that retrospectively extract the correct information from text, which is usually full of ambiguities. Although remarkable results have been obtained under experimental conditions, the widespread use of information-mining tools is lagging behind earlier expectations. This commentary proposes to make semantic tagging an integral part of electronic publishing.
Broadening the Scope of Nanopublications
In this paper, we present an approach for extending the existing concept of
nanopublications --- tiny entities of scientific results in RDF representation
--- to broaden their application range. The proposed extension uses English
sentences to represent informal and underspecified scientific claims. These
sentences follow a syntactic and semantic scheme that we call AIDA (Atomic,
Independent, Declarative, Absolute), which provides a uniform and succinct
representation of scientific assertions. Such AIDA nanopublications are
compatible with the existing nanopublication concept and enjoy most of its
advantages such as information sharing, interlinking of scientific findings,
and detailed attribution, while being more flexible and applicable to a much
wider range of scientific results. We show that users are able to create AIDA
sentences for given scientific results quickly and at high quality, and that it
is feasible to automatically extract and interlink AIDA nanopublications from
existing unstructured data sources. To demonstrate our approach, a web-based
interface is introduced, which also exemplifies the use of nanopublications for
non-scientific content, including meta-nanopublications that describe other
nanopublications.
Comment: To appear in the Proceedings of the 10th Extended Semantic Web
Conference (ESWC 2013)
Time-resolved photoelectron and photoion fragmentation spectroscopy study of 9-methyladenine and its hydrates: a contribution to the understanding of the ultrafast radiationless decay of excited DNA bases.
The excited-state dynamics of the purine base 9-methyladenine (9Me-Ade) has been investigated by time- and energy-resolved photoelectron imaging spectroscopy and mass-selected ion spectroscopy, in both vacuum and water-cluster environments. The specific probe processes used, namely a careful monitoring of time-resolved photoelectron energy distributions and of photoion fragmentation, together with the excellent temporal resolution achieved, enable us to derive additional information on the nature of the excited states (ππ*, nπ*, πσ*, triplet) involved in the electronic relaxation of adenine. The two-step pathway we propose to account for the double exponential decay observed agrees well with recent theoretical calculations. The near-UV photophysics of 9Me-Ade is dominated by the direct excitation of the ππ* (1Lb) state (lifetime of 100 fs), followed by internal conversion to the nπ* state (lifetime in the ps range) via a conical intersection. No evidence for the involvement of a πσ* or a triplet state was found. 9Me-Ade–(H2O)n clusters have been studied, focusing on the fragmentation of these species after the probe process. A careful analysis of the fragments allowed us to provide evidence for a double exponential decay profile for the hydrates. The very weak second component observed, however, led us to conclude that the photophysics were very different from those of the isolated base, which we assign to a competition between (i) a direct one-step decay of the initially excited state (ππ* La and/or Lb, stabilised by hydration) to the ground state and (ii) a modified two-step decay scheme, qualitatively comparable to that occurring in the isolated molecule.
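The two-step decay described in this abstract can be illustrated with textbook sequential first-order kinetics (A -> B -> ground state). The sketch below is not the authors' fitting procedure: the function name is hypothetical, the 100 fs value is the first lifetime quoted in the abstract, and the 1000 fs second lifetime is an assumed placeholder for "in the ps range".

```python
import math


def two_step_populations(t_fs, tau1_fs, tau2_fs):
    """Populations of the initially excited state (A) and the intermediate
    state (B) at time t for sequential first-order decay A -> B -> ground.

    All times are in femtoseconds. The closed form requires tau1 != tau2.
    """
    if tau1_fs == tau2_fs:
        raise ValueError("tau1_fs and tau2_fs must differ for this closed form")
    decay_a = math.exp(-t_fs / tau1_fs)
    decay_b = math.exp(-t_fs / tau2_fs)
    pop_a = decay_a
    # Solution of dB/dt = A/tau1 - B/tau2 with A(0) = 1 and B(0) = 0.
    pop_b = tau2_fs / (tau1_fs - tau2_fs) * (decay_a - decay_b)
    return pop_a, pop_b


# Assumed illustrative lifetimes: 100 fs (from the abstract), 1000 fs (placeholder).
for t in (0.0, 100.0, 500.0, 2000.0):
    a, b = two_step_populations(t, 100.0, 1000.0)
    print(f"t = {t:6.0f} fs  A = {a:.3f}  B = {b:.3f}  ground = {1 - a - b:.3f}")
```

A signal that probes both states with different sensitivities is a weighted sum of the two populations, which is why such a scheme shows up experimentally as a double exponential decay.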
Fluoxetine effects assessment on the life cycle of aquatic invertebrates
Fluoxetine is a serotonin re-uptake inhibitor, generally used as an antidepressant. It is suspected to provoke substantial effects in the aquatic environment. This study reports the effects of fluoxetine on the life cycle of four invertebrate species: Daphnia magna, Hyalella azteca and the snail Potamopyrgus antipodarum, exposed to fluoxetine-spiked water, and the midge Chironomus riparius, exposed to fluoxetine-spiked sediments. For D. magna, a multi-generational study was performed in which newborns from exposed organisms were themselves exposed. Effects of fluoxetine could be found at low measured concentrations (around 10 µg l⁻¹), especially for parthenogenetic reproduction of D. magna and P. antipodarum. For daphnids, newborn length was affected by fluoxetine, and the second generation of exposed individuals showed much more pronounced effects than the first one, with a NOEC of 8.9 µg l⁻¹. For P. antipodarum, a significant decrease in reproduction was found for concentrations around 10 µg l⁻¹. In contrast, we found no effect on the reproduction of H. azteca but a significant effect on growth, which resulted in a NOEC of 33 µg l⁻¹, expressed as nominal concentration. No effect on C. riparius could be found for measured concentrations up to 59.5 mg kg⁻¹. General mechanistic energy-based models showed poor relevance for data analysis, which suggests that fluoxetine targets specific mechanisms of reproduction.
Mining microarray datasets aided by knowledge stored in literature
DNA microarray technology produces large amounts of data. For data mining
of these datasets, background information on genes can be helpful.
Unfortunately most information is stored in free text. Here, we present an
approach to use this information for DNA microarray data mining.
Provenance-Centered Dataset of Drug-Drug Interactions
Over the years, several studies have demonstrated the ability to identify
potential drug-drug interactions via data mining from the literature (MEDLINE),
electronic health records, public databases (Drugbank), etc. While each one of
these approaches is properly statistically validated, they do not take into
consideration the overlap between them as one of their decision-making
variables. In this paper we present LInked Drug-Drug Interactions (LIDDI), a
public nanopublication-based RDF dataset with trusty URIs that encompasses some
of the most cited prediction methods and sources to provide researchers with a
resource for incorporating the work of others into their own prediction methods.
A main obstacle to using external resources is the mapping between the drug
names and identifiers they use, so we also provide the set of mappings we
curated to compare the multiple sources aggregated in our dataset.
Comment: In Proceedings of the 14th International Semantic Web Conference
(ISWC) 201
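The idea of using the overlap between sources, after mapping drug names to shared identifiers, can be sketched as follows. This is only an illustration under stated assumptions: the identifiers (`DRUG:1` and so on), the source names, and the prediction lists are all invented for the example, and none of it reflects the actual LIDDI schema or data.

```python
from collections import defaultdict

# Hypothetical name-to-identifier mapping (invented identifiers).
# A curated mapping would take the place of this dictionary.
NAME_TO_ID = {
    "aspirin": "DRUG:1",
    "acetylsalicylic acid": "DRUG:1",
    "warfarin": "DRUG:2",
    "ibuprofen": "DRUG:3",
}


def normalise_pair(name_a, name_b, mapping):
    """Map two drug names to an unordered pair of shared identifiers."""
    return frozenset((mapping[name_a.lower()], mapping[name_b.lower()]))


def source_support(predictions_by_source, mapping):
    """Return, for each normalised drug pair, the set of sources predicting it."""
    support = defaultdict(set)
    for source, pairs in predictions_by_source.items():
        for name_a, name_b in pairs:
            support[normalise_pair(name_a, name_b, mapping)].add(source)
    return support


# Invented prediction lists, for illustration only.
example_predictions = {
    "source_A": [("Aspirin", "Warfarin")],
    "source_B": [("warfarin", "acetylsalicylic acid"), ("ibuprofen", "warfarin")],
}

for pair, sources in source_support(example_predictions, NAME_TO_ID).items():
    print(sorted(pair), "predicted by", sorted(sources))
```

Without the name mapping, the two spellings of the same drug would be counted as different pairs and the overlap between the sources would be missed.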
Ambiguity of human gene symbols in LocusLink and MEDLINE: creating an inventory and a disambiguation test collection
Genes are discovered almost on a daily basis and new names have to be
found. Although there are guidelines for gene nomenclature, the naming
process is highly creative. Human genes are often named with a gene symbol
and a longer, more descriptive term; the short form is very often an
abbreviation of the long form. Abbreviations in biomedical language are
highly ambiguous, i.e., one gene symbol often refers to more than one
gene. Using an existing abbreviation expansion algorithm, we explore MEDLINE
for the use of human gene symbols derived from LocusLink. It turns out
that just over 40% of these symbols occur in MEDLINE; however, many of
these occurrences are not related to genes. In the process of making an
inventory, a disambiguation test collection is constructed automatically.
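A minimal sketch of how an abbreviation expansion can help decide whether a symbol occurrence refers to a gene is given below. It swaps in a deliberately naive "long form (SYMBOL)" pattern and is not the algorithm used in the paper. The symbol `ZQX`, both long forms, and the function names are invented for the example.

```python
import re

# Matches "word word ... (SYMBOL)", capturing up to six words before the parenthesis.
PATTERN = re.compile(r"((?:[\w-]+\s+){1,6})\(([A-Za-z0-9-]+)\)")

# Hypothetical inventory: symbol -> known long forms of genes with that symbol.
GENE_LONG_FORMS = {"ZQX": {"zeta quinone exchanger"}}


def expansions(text):
    """Return (symbol, candidate long form) pairs found in the text."""
    found = []
    for match in PATTERN.finditer(text):
        words = match.group(1).split()
        symbol = match.group(2)
        # Crude heuristic: keep as many trailing words as the symbol has characters.
        long_form = " ".join(words[-len(symbol):]).lower()
        found.append((symbol, long_form))
    return found


def is_gene_sense(symbol, long_form, gene_long_forms):
    """True if the expansion matches a known gene long form for this symbol."""
    return long_form in gene_long_forms.get(symbol, set())


sentences = [
    "Expression of zeta quinone exchanger (ZQX) was reduced.",
    "Each site received a zone quality extension (ZQX) rating.",
]
for sentence in sentences:
    for symbol, long_form in expansions(sentence):
        print(symbol, repr(long_form), is_gene_sense(symbol, long_form, GENE_LONG_FORMS))
```

Occurrences labelled this way, as gene or non-gene senses, are the kind of material from which a disambiguation test collection could be assembled automatically.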
Using contextual queries
Search engines generally treat search requests in isolation. The results
for a given query are identical regardless of the user or the context
in which the request was made. An approach is demonstrated that
exploits implicit context obtained from a document the user is
reading. The approach inserts into an original (web) document the
functionality to directly activate context-driven queries that yield
related articles from various information sources.
Co-occurrence based meta-analysis of scientific texts: retrieving biological relationships between genes
MOTIVATION: The advent of high-throughput experiments in molecular biology creates a need for methods to efficiently extract and use information for large numbers of genes. Recently, the associative concept space (ACS) has been developed for the representation of information extracted from biomedical literature. The ACS is a Euclidean space in which thesaurus concepts are positioned and the distances between concepts indicate their relatedness. The ACS uses co-occurrence of concepts as a source of information. In this paper we evaluate how well the system can retrieve functionally related genes and we compare its performance with a simple gene co-occurrence method. RESULTS: To assess the performance of the ACS we composed a test set of five groups of functionally related genes. With the ACS, good scores were obtained for four of the five groups. When compared to the gene co-occurrence method, the ACS is capable of revealing more functional biological relations and can achieve results with less literature available per gene. Hierarchical clustering was performed on the ACS output, as a potential aid to users, and was found to provide useful clusters. Our results suggest that the algorithm can be of value for researchers studying large numbers of genes. AVAILABILITY: The ACS program is available upon request from the authors.
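For orientation, a simple gene co-occurrence baseline of the kind the abstract compares against might look like the sketch below. The abstract does not spell out the details of that baseline, so the counting scheme, the function names, and the toy gene identifiers (`g1` to `g4`) are assumptions. The sketch does not implement the ACS itself.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered gene pair, the documents mentioning both.

    `documents` is an iterable of sets of gene identifiers, one set per document.
    """
    counts = Counter()
    for genes in documents:
        for pair in combinations(sorted(genes), 2):
            counts[pair] += 1
    return counts


def related_genes(gene, counts):
    """Rank other genes by how often they co-occur with `gene`."""
    scores = Counter()
    for (gene_a, gene_b), n in counts.items():
        if gene_a == gene:
            scores[gene_b] += n
        elif gene_b == gene:
            scores[gene_a] += n
    return scores.most_common()


# Toy corpus with invented gene identifiers.
toy_documents = [{"g1", "g2"}, {"g1", "g2", "g3"}, {"g3", "g4"}]
toy_counts = cooccurrence_counts(toy_documents)
print(related_genes("g1", toy_counts))
```

In this toy corpus, g1 and g4 never appear in the same document, so direct co-occurrence finds no link between them even though both co-occur with g3. That is one way to see why a direct counting method depends on how much literature is available per gene.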
