AutoDiscern: Rating the Quality of Online Health Information with Hierarchical Encoder Attention-based Neural Networks
Patients increasingly turn to search engines and online content before, or in
place of, talking with a health professional. Low quality health information,
which is common on the internet, presents risks to the patient in the form of
misinformation and a possibly poorer relationship with their physician. To
address this, the DISCERN criteria (developed at University of Oxford) are used
to evaluate the quality of online health information. However, patients are
unlikely to take the time to apply these criteria to the health websites they
visit. We built an automated implementation of the DISCERN instrument (Brief
version) using machine learning models. We compared the performance of a
traditional model (Random Forest) with that of a hierarchical encoder
attention-based neural network (HEA) model using two language embeddings, BERT
and BioBERT. The HEA BERT and BioBERT models achieved average F1-macro scores
across all criteria of 0.75 and 0.74, respectively, outperforming the Random
Forest model (average F1-macro = 0.69). Overall, the neural network-based
models achieved 81% and 86% average accuracy at 100% and 80% coverage,
respectively, compared to 94% manual rating accuracy. The attention mechanism
implemented in the HEA architectures not only provided 'model explainability'
by identifying reasonable supporting sentences for the documents fulfilling the
Brief DISCERN criteria, but also boosted F1 performance by 0.05 compared to the
same architecture without an attention mechanism. Our research suggests that it
is feasible to automate online health information quality assessment, which is
an important step towards empowering patients to become informed partners in
the healthcare process.
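The accuracy-at-coverage figures above reflect selective prediction: at 80% coverage the model abstains on its 20% least-confident cases and accuracy is computed on the rest. A minimal sketch of that evaluation, with all confidence values and labels invented for illustration (not data from the paper):

```python
# Selective-prediction sketch: keep only the most-confident fraction
# `coverage` of predictions and score accuracy on that retained subset.
# The confidences and correctness flags below are made-up toy data.

def accuracy_at_coverage(confidences, correct, coverage):
    """Accuracy on the most-confident `coverage` fraction of predictions."""
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    k = max(1, round(coverage * len(order)))  # number of predictions kept
    kept = order[:k]
    return sum(correct[i] for i in kept) / k

conf = [0.9, 0.8, 0.7, 0.6, 0.5]  # model confidence per document
ok   = [1,   1,   1,   0,   0]    # 1 = prediction was correct
print(accuracy_at_coverage(conf, ok, 1.0))  # 0.6  (all predictions scored)
print(accuracy_at_coverage(conf, ok, 0.8))  # 0.75 (least-confident one dropped)
```

Dropping low-confidence cases trades coverage for accuracy, which is why the abstract reports a higher accuracy at 80% coverage than at 100%.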
Neural networks versus Logistic regression for 30 days all-cause readmission prediction
Heart failure (HF) is one of the leading causes of hospital admissions in the
US. Readmission within 30 days after a HF hospitalization is both a recognized
indicator for disease progression and a source of considerable financial burden
to the healthcare system. Consequently, the identification of patients at risk
for readmission is a key step in improving disease management and patient
outcome. In this work, we used a large administrative claims dataset to
(1) explore the systematic application of neural network-based models versus
logistic regression for predicting 30 days all-cause readmission after
discharge from a HF admission, and (2) examine the additive value of
patients' hospitalization timelines on prediction performance. Based on data
from 272,778 (49% female) patients with a mean (SD) age of 73 years (14) and
343,328 HF admissions (67% of total admissions), we trained and tested our
predictive readmission models following a stratified 5-fold cross-validation
scheme. Among the deep learning approaches, a recurrent neural network (RNN)
combined with conditional random fields (CRF) model (RNNCRF) achieved the best
performance in readmission prediction with 0.642 AUC (95% CI, 0.640-0.645).
Other models, such as those based on RNN, convolutional neural networks and CRF
alone had lower performance, with a non-timeline based model (MLP) performing
worst. A competitive model based on logistic regression with LASSO achieved a
performance of 0.643 AUC (95% CI, 0.640-0.646). We conclude that data from
patient timelines improve 30-day readmission prediction for neural
network-based models, that logistic regression with LASSO matches the
performance of the best neural network model, and that the use of
administrative data results in competitive performance compared to published
approaches based on richer clinical datasets.
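The stratified 5-fold cross-validation scheme mentioned above splits the data so that every fold preserves the class ratio (readmitted vs. not). A minimal pure-Python sketch of that splitting logic; the 20/80 label mix is invented to mimic a readmission-style imbalance, not taken from the study:

```python
# Stratified k-fold sketch: indices are grouped by class label, shuffled,
# and dealt round-robin into k folds so each fold keeps the class ratio.

import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield k (train_indices, test_indices) splits with per-class balance."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # round-robin keeps folds balanced
    for t in range(k):
        test = folds[t]
        train = [i for f in range(k) if f != t for i in folds[f]]
        yield train, test

labels = [1] * 20 + [0] * 80  # 20% positive class, purely illustrative
for train, test in stratified_kfold(labels):
    print(len(test), sum(labels[i] for i in test))  # 20 samples, 4 positives
```

With this scheme each of the five folds holds exactly 20% of the samples and 20% positives, so every held-out evaluation sees the same class balance as the full dataset.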
Mining Images in Biomedical Publications: Detection and Analysis of Gel Diagrams
Authors of biomedical publications use gel images to report experimental
results such as protein-protein interactions or protein expressions under
different conditions. Gel images offer a concise way to communicate such
findings, not all of which need to be explicitly discussed in the article text.
This fact together with the abundance of gel images and their shared common
patterns makes them prime candidates for automated image mining and parsing. We
introduce an approach for the detection of gel images, and present a workflow
to analyze them. We are able to detect gel segments and panels at high
accuracy, and present preliminary results for the identification of gene names
in these images. While we cannot provide a complete solution at this point, we
present evidence that this kind of image mining is feasible.
Term Mapping Using Matrix Operations
We believe that gene name identification is a modular process involving term recognition, classification and mapping. This work's focus is on gene name mapping, and we assume that names are already recognized and classified. We use a combination of two methods to map recognized entities to their appropriate gene identifiers (Entrez GeneIDs): the Trigram Method, and the Network Method. Both methods require preprocessing, using resources from Entrez Gene, to construct a set of method-specific matrices. We first address lexical variation by transforming gene names into their unique "trigrams" (groups of three alphanumeric characters), and perform trigram matching against the preprocessed gene dictionary. For ambiguous gene names, we additionally perform a contextual analysis of the abstract that contains the recognized entity. We have formalized our method as a sequence of matrix manipulations, allowing for a fast and coherent implementation of the algorithm. In this talk, we also show how gene name identification, and text mining in general, can play a critical role in translational medicine. We demonstrate how term identification is useful for establishing a biobibliometric distance between genes and psychiatric disorders
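The trigram step described above can be sketched as follows. The gene names and IDs are invented (not real Entrez Gene entries), and a simple Jaccard overlap stands in here for the abstract's matrix formulation:

```python
# Trigram-matching sketch: a gene name is normalized to lowercase
# alphanumerics, reduced to its character 3-grams, and mapped to the
# dictionary entry with the greatest trigram overlap. Names and IDs
# below are illustrative only.

def trigrams(name):
    """Return the set of alphanumeric character trigrams of a gene name."""
    s = "".join(ch for ch in name.lower() if ch.isalnum())
    return {s[i:i + 3] for i in range(len(s) - 2)}

def match(query, dictionary):
    """Map a recognized name to the ID whose trigram set overlaps most."""
    q = trigrams(query)
    best_id, best_score = None, 0.0
    for gene_id, name in dictionary.items():
        t = trigrams(name)
        score = len(q & t) / max(len(q | t), 1)  # Jaccard similarity
        if score > best_score:
            best_id, best_score = gene_id, score
    return best_id

genes = {"G1": "BRCA1", "G2": "TP53"}      # hypothetical dictionary
print(match("brca-1", genes))              # lexical variants share trigrams
```

Because normalization strips punctuation and case, variants like "brca-1" and "BRCA1" collapse to identical trigram sets, which is how the method absorbs lexical variation before any contextual disambiguation.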
Decentralized provenance-aware publishing with nanopublications
Publication and archival of scientific results is still commonly considered the responsibility of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable
Dialogue of civilizations in a multipolar world: toward a multicivilizational-multiplex world order
In this article, I explore the relationship between the new multipolar trends related to the emerging powers and the idea of dialogue of civilizations. My starting point is to understand multipolarity as part of a broader, epoch-making process of transformation of contemporary international society beyond its Western-centric matrix. In the first part of this article, I therefore argue for an analytical understanding that emphasizes the emergence of a new multipolar world of civilizational politics and multiple modernities. In the second part of the article, I reflect on how to counter the risk inherent in the potential antagonistic logic of multipolarity by critically engaging the normative Huntingtonian construction of a multicivilizational-multipolar world order. I argue that the link between dialogue of civilizations and regionalism could represent a critical issue for the future of global peace. In particular, multiculturally constituted processes of regional integration are antidotes to the possible negative politicization of cultural differences on a global scale and can contribute to the emergence of a new cross-cultural jus gentium. These elements are critical to the construction of a realistic dialogue of civilizations in international relations while preventing the risks inherent in its growing multipolar configuration. They shape what, drawing on Amitav Acharya's work, could be named a multicivilizational-multiplex world order
A semantic web framework to integrate cancer omics data with biological knowledge
BACKGROUND: The RDF triple provides a simple linguistic means of describing limitless types of information. Triples can be flexibly combined into a unified data source we call a semantic model. Semantic models open new possibilities for the integration of variegated biological data. We use Semantic Web technology to explicate high throughput clinical data in the context of fundamental biological knowledge. We have extended Corvus, a data warehouse which provides a uniform interface to various forms of Omics data, by providing a SPARQL endpoint. With the querying and reasoning tools made possible by the Semantic Web, we were able to explore quantitative semantic models retrieved from Corvus in the light of systematic biological knowledge. RESULTS: For this paper, we merged semantic models containing genomic, transcriptomic and epigenomic data from melanoma samples with two semantic models of functional data - one containing Gene Ontology (GO) data, the other, regulatory networks constructed from transcription factor binding information. These two semantic models were created in an ad hoc manner but support a common interface for integration with the quantitative semantic models. Such combined semantic models allow us to pose significant translational medicine questions. Here, we study the interplay between a cell's molecular state and its response to anti-cancer therapy by exploring the resistance of cancer cells to Decitabine, a demethylating agent. CONCLUSIONS: We were able to generate a testable hypothesis to explain how Decitabine fights cancer - namely, that it targets apoptosis-related gene promoters predominantly in Decitabine-sensitive cell lines, thus conveying its cytotoxic effect by activating the apoptosis pathway. Our research provides a framework whereby similar hypotheses can be developed easily
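The triple-and-pattern-matching idea behind these semantic models can be illustrated with a toy in-memory store. The entity names below are invented, and this plain-Python lookup merely mimics the shape of an RDF/SPARQL query; it is not the Corvus interface itself:

```python
# Toy RDF-style store: facts of any kind are (subject, predicate, object)
# triples, and queries are patterns where None acts as a wildcard.
# All entities here are illustrative, not real Corvus or GO data.

triples = {
    ("geneA", "expressed_in", "melanoma_sample_1"),
    ("geneA", "annotated_with", "GO:apoptosis"),
    ("geneB", "annotated_with", "GO:cell_cycle"),
}

def query(s=None, p=None, o=None):
    """Return all triples matching the (s, p, o) pattern."""
    return {t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)}

print(query(p="annotated_with"))   # every annotation fact, any gene
print(query(s="geneA"))            # everything known about geneA
```

The point of the abstract is that because every fact shares this one uniform shape, quantitative omics triples and knowledge triples (e.g. GO annotations) can be merged into one store and queried together.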
Writing clinical practice guidelines in controlled natural language
Clinicians could benefit from decision support systems incorporating the knowledge contained in clinical practice guidelines. However, the unstructured form of these guidelines makes them unsuitable for formal representation. To address this challenge we translated a complete set of pediatric guideline recommendations into Attempto Controlled English (ACE). One experienced pediatrician, one physician and a knowledge engineer assessed that a suitably extended version of ACE can accurately and naturally represent the clinical concepts and the proposed actions of the guidelines. Currently, we are developing a systematic and replicable approach to authoring guideline recommendations in ACE
