209 research outputs found

    Analysis and visualisation of RDF resources in Ondex

    An increasing number of biomedical resources provide their information on the Semantic Web and this creates the basis for a distributed knowledge base which has the potential to advance biomedical research [1]. This potential, however, cannot be realized until researchers from the life sciences can interact with information in the Semantic Web. In particular, there is a need for tools that provide data reduction, visualization and interactive analysis capabilities.
Ondex is a data integration and visualization platform developed to support Systems Biology research [2]. At its core is a data model based on two main principles: first, all information can be represented as a graph and, second, all elements of the graph can be annotated with ontologies. This data model conforms to the Semantic Web framework, in particular to RDF, and therefore Ondex is well positioned as a platform that can exploit the Semantic Web.
The Ondex system offers a range of features and analysis methods of potential value to Semantic Web users, including:
-	An interactive graph visualization interface (Ondex user client), which provides data reduction and representation methods that leverage the ontological annotation.
-	A suite of importers from a variety of data sources to Ondex (http://ondex.org/formats.html).
-	A collection of plug-ins which implement graph analysis, graph transformation and graph-matching functions.
-	An integration toolkit (Ondex Integrator) which allows users to compose workflows from these modular components.
-	In addition, all importers and plug-ins are available as web services, which can be integrated into other tools such as Taverna [3].
The developments that will be presented in this demo have made this functionality interoperable with the Semantic Web framework. In particular, we have developed an interactive, SPARQL-based importer that allows the query-driven construction of datasets, bringing together information from different RDF data resources in Ondex.
These datasets can then be further refined, analysed and annotated, both interactively using the Ondex user client and via user-defined workflows. The results of these analyses can be exported in RDF, which can be used to enrich existing knowledge bases or to provide application-specific views of the data. Both the importer and the exporter cover only the subset of the Ondex and RDF data models that is shared between the two data representations [4].
In this demo we will show how Ondex can be used to query, analyse and visualize Semantic Web knowledge bases. In particular, we will present real use cases focused on, but not limited to, resources relevant to plant biology.
We believe that Ondex can be a valid contribution to the adoption of the Semantic Web in Systems Biology research and in biomedical investigation more generally. We welcome feedback on our current import/export prototype and suggestions for the advancement of Ondex for the Semantic Web.
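As a rough illustration of the data model this abstract describes (a graph whose nodes and edges are annotated with ontology terms, and which can be exported as RDF), the following Python sketch may help. It is not Ondex code: the `Concept` and `Relation` classes, the `example.org` namespace and the sample gene and protein are assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/ondex/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Concept:
    id: str
    concept_class: str  # ontology term annotating the node, e.g. "Gene"


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    relation_type: str  # ontology term annotating the edge


def to_triples(concepts, relations):
    """Turn the annotated graph into (subject, predicate, object) URI triples."""
    triples = [(BASE + c.id, RDF_TYPE, BASE + c.concept_class) for c in concepts]
    triples += [
        (BASE + r.source, BASE + r.relation_type, BASE + r.target)
        for r in relations
    ]
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


concepts = [Concept("g1", "Gene"), Concept("p1", "Protein")]
relations = [Relation("g1", "p1", "encodes")]
print(to_ntriples(to_triples(concepts, relations)))
```

Each node contributes one `rdf:type` triple and each edge contributes one triple whose predicate is the relation type, which is one simple way the shared subset of a typed graph and RDF could line up.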

References

1.	Ruttenberg, A. et al.: Advancing translational research with the Semantic Web, BMC Bioinformatics, 8 (Suppl. 3): S2 (2007).
2.	Köhler, J., Baumbach, J., Taubert, J., Specht, M., Skusa, A., Ruegg, A., Rawlings, C., Verrier, P., Philippi, S.: Graph-based analysis and visualization of experimental results with Ondex. Bioinformatics 22 (11):1383-1390 (2006).
3.	Rawlings, C.: Semantic Data Integration for Systems Biology Research, Technology Track at ISMB’09, http://www.iscb.org/uploaded/css/36/11846.pdf (2009).
4.	Splendiani, A. et al.: Ondex semantic definition, (Web document) http://ondex.svn.sourceforge.net/viewvc/ondex/trunk/doc/semantics/ (2009).

    Haplotype characterization of a stranded common minke whale calf (Balaenoptera acutorostrata Lacépède, 1804): Is the Mediterranean Sea a potential calving or nursery ground for the species?

    The stranding of a suckling calf of the common minke whale (Balaenoptera acutorostrata) on the coast near Salerno (Campania, Southern Italy) is reported. Molecular analysis of a partial sequence of the mitochondrial DNA control region shows that the animal bore a haplotype identical to haplotype Ba169, which is considered typical of individuals from the North Atlantic population. Historical data and our results suggest that the Mediterranean Sea may be a potential calving or nursery ground for this species.

    Towards linked open gene mutations data

    Background: With the advent of high-throughput technologies, a great wealth of variation data is being produced. Such information may constitute the basis for correlation analyses between genotypes and phenotypes and, in the future, for personalized medicine. Several databases on gene variation exist, but this kind of information is still scarce in the Semantic Web framework. In this paper, we discuss issues related to the integration of mutation data in the Linked Open Data infrastructure, part of the Semantic Web framework. We present the development of a mapping from the IARC TP53 Mutation database to RDF and the implementation of servers publishing this data. Methods: A version of the IARC TP53 Mutation database implemented in a relational database was used as a first test set. Automatic mappings to RDF were first created by using D2RQ and later manually refined by introducing concepts and properties from domain vocabularies and ontologies, as well as links to Linked Open Data implementations of various systems of biomedical interest. Since D2RQ query performance is lower than that achievable with an RDF archive, the generated data was also loaded into a dedicated system based on tools from the Jena software suite. Results: We have implemented a D2RQ Server for TP53 mutation data, providing data on a subset of the IARC database, including gene variations, somatic mutations, and bibliographic references. The server allows users to browse the RDF graph by following links both between classes and to external systems. An alternative interface offers improved performance for SPARQL queries. The resulting data can be explored by using any Semantic Web browser or application. Conclusions: This has been the first case of a mutation database exposed as Linked Data. A revised version of our prototype, including further concepts and IARC TP53 Mutation database data sets, is under development. The publication of variation information as Linked Data opens new perspectives: the exploitation of SPARQL searches on mutation data and other biological databases may support data retrieval which is presently not possible. Moreover, reasoning on integrated variation data may support discoveries towards personalized medicine.
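The idea of mapping relational rows to RDF, which this abstract describes, can be sketched in a few lines of Python. This is only a hand-rolled illustration and does not use D2RQ or the actual IARC schema: the `mutation` table, its columns, the sample rows and the `example.org` namespace are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/tp53/"


def rows_to_triples(conn):
    """Map each row of the hypothetical `mutation` table to RDF-style triples."""
    triples = []
    query = "SELECT id, gene, codon FROM mutation ORDER BY id"
    for mut_id, gene, codon in conn.execute(query):
        # One subject URI per row, one triple per non-key column.
        subject = f"{BASE}mutation/{mut_id}"
        triples.append((subject, BASE + "gene", gene))
        triples.append((subject, BASE + "codon", codon))
    return triples


# In-memory database with made-up rows standing in for the relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mutation (id INTEGER PRIMARY KEY, gene TEXT, codon INTEGER)")
conn.execute("INSERT INTO mutation VALUES (1, 'TP53', 175)")
conn.execute("INSERT INTO mutation VALUES (2, 'TP53', 248)")

for triple in rows_to_triples(conn):
    print(triple)
```

The manual refinement step mentioned in the abstract would correspond to swapping the placeholder predicates for terms from domain vocabularies and replacing literal values with links to external resources.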

    Native or overlooked translocation? comment on Antognazza et al. Current and historical genetic variability of native brown trout populations in a Southern Alpine ecosystem: implications for future management. Fishes 2023, 8, 411

    The recent revision of Italian legislation on nature conservation has highlighted the pressing necessity of elucidating the native distribution range of managed species. A recent study by Antognazza et al. (Current and Historical Genetic Variability of Native Brown Trout Populations in a Southern Alpine Ecosystem: Implications for Future Management. Fishes 2023, 8, 411) provides insights into the native status of brown trout in the Lombardy Prealps, northern Italy, and advocates urgent conservation measures. However, the possible effect of historical and recent anthropogenic impacts was dismissed in the paper. Here, we present how human-mediated activities might plausibly contribute to the observed distribution of population genetic variation, considering both the available literature and ongoing “Mediterranean trout” stocking activities in the region. Implementing management strategies without clear scientific evidence poses significant risks to native biodiversity conservation.

    Phylogenetic and biogeographic history of brook lampreys (Lampetra: Petromyzontidae) in the river basins of the Adriatic Sea based on DNA barcode data.

    The Adriatic brook lamprey, Lampetra zanandreai Vladykov 1955, was described from northeastern Italy. Its distribution is thought to include left tributaries of the River Po and the river basins of the Adriatic Sea from the River Po to the River Isonzo/Soča in Italy, Switzerland and Slovenia. It also shows a geographically isolated distribution in the Potenza River on the Adriatic slope in Central Italy. Lampetra from the Neretva River system in Croatia and Bosnia and Herzegovina and the Morača River system in Montenegro that were previously identified as L. zanandreai were recently described as a new species Lampetra soljani Tutman, Freyhof, Dulčić, Glamuzina & Geiger 2017 based on morphological data and a genetic distance between the two species of roughly 2.5% in the DNA barcoding gene cytochrome oxidase I (COI). Since DNA barcodes for L. zanandreai are only available for one population from the upper Po River in northwestern Italy, we generated additional COI nucleotide sequence data of this species from Switzerland, northeastern and central Italy comprising near topotypic material and obtained GenBank sequences of the species from Slovenia to better assess the evolutionary history of the two brook lamprey species in the river basins of the Adriatic Sea. Our data show a low sequence divergence of <1% between L. zanandreai from Switzerland, northeastern and central Italy and Slovenia and the Balkan species L. soljani. However, members of the population previously identified as 'L. zanandreai' from northwest Italy are genetically highly divergent from those of L. zanandreai and likely belong to an undescribed species, L. sp. 'upper Po'. The presence of a unique and highly divergent brook lamprey lineage in the upper Po River suggests that L. zanandreai and Lampetra sp. 'upper Po' may have evolved in separate paleo drainages during the formation of the modern Po Valley subsequent to marine inundations in the Pliocene
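Sequence divergence figures like the roughly 2.5% and <1% values quoted above are often reported as an uncorrected p-distance, the proportion of aligned sites that differ. The abstract does not say which distance measure was used, so the Python sketch below is only an assumption-laden illustration; the function name and the toy sequences are made up.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, which is one common convention and an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (not real COI data): 1 difference over 40 sites.
fragment_a = "A" * 40
fragment_b = "A" * 39 + "G"
print(f"{p_distance(fragment_a, fragment_b):.3f}")
```

With one mismatch in 40 compared sites the function returns 1/40, i.e. a divergence of 2.5%.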

    Biomedical semantics in the Semantic Web

    The Semantic Web offers an ideal platform for representing and linking biomedical information, which is a prerequisite for the development and application of analytical tools to address problems in data-intensive areas such as systems biology and translational medicine. As for any new paradigm, the adoption of the Semantic Web offers opportunities and poses questions and challenges to the life sciences scientific community: which technologies in the Semantic Web stack will be more beneficial for the life sciences? Is biomedical information too complex to benefit from simple interlinked representations? What are the implications of adopting a new paradigm for knowledge representation? What are the incentives for the adoption of the Semantic Web, and who are the facilitators? Is there going to be a Semantic Web revolution in the life sciences?

    Gauging triple stores with actual biological data

    Background: Semantic Web technologies have been developed to overcome the limitations of the current Web and conventional data integration solutions. The Semantic Web is expected to link all the data present on the Internet instead of linking just documents. One of the foundations of the Semantic Web technologies is the knowledge representation language Resource Description Framework (RDF). Knowledge expressed in RDF is typically stored in so-called triple stores (also known as RDF stores), from which it can be retrieved with SPARQL, a language designed for querying RDF-based models. The Semantic Web technologies should allow federated queries over multiple triple stores. In this paper we compare the efficiency of a set of biologically relevant queries as applied to a number of different triple store implementations. Results: Previously we developed a library of queries to guide the use of our knowledge base Cell Cycle Ontology implemented as a triple store. We have now compared the performance of these queries on five non-commercial triple stores: OpenLink Virtuoso (Open-Source Edition), Jena SDB, Jena TDB, SwiftOWLIM and 4Store. We examined three performance aspects: the data uploading time, the query execution time and the scalability. The queries we had chosen addressed diverse ontological or biological questions, and we found that individual store performance was quite query-specific. We identified three groups of queries displaying similar behaviour across the different stores: 1) relatively short response time queries, 2) moderate response time queries and 3) relatively long response time queries. SwiftOWLIM proved to be a winner in the first group, 4Store in the second one and Virtuoso in the third one. Conclusions: Our analysis showed that some queries behaved idiosyncratically, in a triple store specific manner, mainly with SwiftOWLIM and 4Store. 
Virtuoso, as expected, displayed a very balanced performance: its load time and its response time for all the tested queries were better than average among the selected stores, and it showed very good scalability and reasonable run-to-run reproducibility. Jena SDB and Jena TDB were consistently slower than the other three implementations. Our analysis demonstrated that most queries developed for Virtuoso could be successfully used for other implementations. © 2012 Mironov et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
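A minimal Python sketch of the kind of measurement described in this abstract (timing a library of queries and grouping them by response time) might look as follows. It does not talk to any real triple store: `run_query`, the in-memory `TRIPLES` list, the query names and the two threshold values are all assumptions, since the abstract gives no cut-offs for the three groups.

```python
import statistics
import time


def time_queries(run_query, queries, repeats=3):
    """Return the median wall-clock time in seconds for each named query."""
    timings = {}
    for name, query in queries.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(query)
            samples.append(time.perf_counter() - start)
        timings[name] = statistics.median(samples)
    return timings


def group_by_response_time(timings, short_limit, long_limit):
    """Split query names into short, moderate and long response-time groups."""
    groups = {"short": [], "moderate": [], "long": []}
    for name, seconds in sorted(timings.items()):
        if seconds < short_limit:
            groups["short"].append(name)
        elif seconds < long_limit:
            groups["moderate"].append(name)
        else:
            groups["long"].append(name)
    return groups


# Stand-in for a real triple store: a small list of triples scanned linearly.
TRIPLES = [
    ("cell_cycle", "part_of", "cell_process"),
    ("cdc2", "participates_in", "cell_cycle"),
]


def run_query(predicate):
    """Hypothetical query runner: return all triples with the given predicate."""
    return [t for t in TRIPLES if t[1] == predicate]


timings = time_queries(
    run_query, {"q_part_of": "part_of", "q_participates": "participates_in"}
)
print(group_by_response_time(timings, short_limit=0.1, long_limit=1.0))
```

Taking the median over several runs is one simple way to dampen run-to-run variation; a real comparison would replace `run_query` with a call to each store's SPARQL endpoint.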

    Pug-Headedness Anomaly in a Wild and Isolated Population of Native Mediterranean Trout Salmo trutta L., 1758 Complex (Osteichthyes: Salmonidae)

    Skeletal anomalies are commonplace among farmed fish. The pug-headedness anomaly is an osteological condition that results in the deformation of the maxilla, pre-maxilla, and infraorbital bones. Here, we report the first record of pug-headedness in an isolated population of the critically endangered native Mediterranean trout Salmo trutta L., 1758 complex from Sardinia, Italy. Fin clips were collected for the molecular analyses (D-loop, LDH-C1* locus, and 11 microsatellites). A jaw index (JI) was used to classify jaw deformities. Ratios between the values of morphometric measurements of the head and body length were calculated and plotted against values of body length to identify the ratios that best discriminated between malformed and normal trout. Haplotypes belonging to the AD lineage and the genotype LDH-C1*100/100 were observed in all samples, suggesting high genetic integrity of the population. The analysis of 11 microsatellites revealed that observed heterozygosity was similar to the expected one, suggesting the absence of inbreeding or outbreeding depression. The frequency of occurrence of pug-headedness was 12.5% (two out of 16). One specimen had a strongly blunted forehead and an abnormally short upper jaw, while another had a slightly anomalous, asymmetrical jaw. Although sample size was limited, variation in environmental factors during larval development seemed to be the most likely trigger of the deformities.

    A power law global error model for the identification of differentially expressed genes in microarray data

    BACKGROUND: High-density oligonucleotide microarray technology enables the discovery of genes that are transcriptionally modulated in different biological samples due to physiology, disease or intervention. Methods for the identification of these so-called "differentially expressed genes" (DEG) would greatly benefit from a deeper knowledge of the intrinsic measurement variability. Though it is clear that the variance of repeated measures is highly dependent on the average expression level of a given gene, there is still a lack of consensus on how signal reproducibility is linked to signal intensity. The aim of this study was to empirically model the variance versus mean dependence in microarray data to improve the performance of existing methods for identifying DEG. RESULTS: In the present work we used data generated by our lab as well as publicly available data sets to show that the dispersion of repeated measures depends on the location of the measures themselves, following a power law. This enables us to construct a power law global error model (PLGEM) that is applicable to various Affymetrix GeneChip data sets. A new DEG identification method is therefore proposed, consisting of a statistic designed to make explicit use of model-derived measurement spread estimates and a resampling-based hypothesis testing algorithm. CONCLUSIONS: The new method provides control of the false positive rate, a good sensitivity vs. specificity trade-off and consistent results with varying numbers of replicates, even when using single samples.
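The power-law dependence of spread on mean described above can be illustrated with a simple log-log least-squares fit. The Python sketch below is a simplification under stated assumptions and not the fitting procedure of the paper: it fits ln(spread) = slope * ln(mean) + intercept directly to per-gene values, and the function names and synthetic data are made up for the example.

```python
import math


def fit_power_law(means, spreads):
    """Least-squares fit of ln(spread) = slope * ln(mean) + intercept."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in spreads]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def modelled_spread(mean, slope, intercept):
    """Spread predicted by the fitted power law: exp(intercept) * mean ** slope."""
    return math.exp(intercept) * mean ** slope


# Synthetic per-gene values that follow spread = 0.5 * mean ** 0.8 exactly.
means = [10.0, 100.0, 1000.0, 10000.0]
spreads = [0.5 * m ** 0.8 for m in means]

slope, intercept = fit_power_law(means, spreads)
print(f"slope={slope:.3f}, prefactor={math.exp(intercept):.3f}")
print(f"predicted spread at mean 500: {modelled_spread(500.0, slope, intercept):.3f}")
```

Once the slope and intercept are known, a model-derived spread estimate is available for any expression level, which is the ingredient a test statistic of the kind described in the abstract would draw on.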