CIMTDetect: A Community Infused Matrix-Tensor Coupled Factorization Based Method for Fake News Detection
Detecting whether a news article is fake or genuine is a crucial task in
today's digital world where it's easy to create and spread a misleading news
article. This is especially true of news stories shared on social media since
they don't undergo the stringent journalistic checking associated with
mainstream media. Given the inherent human tendency to share information with
social connections at a mouse click, fake news articles masquerading as real
ones tend to spread widely and virally. The presence of echo chambers (people
sharing the same beliefs) in social networks only adds to the problem of the
widespread existence of fake news on social media. In this paper, we tackle
the problem of fake news detection from social media by exploiting the very
presence of echo chambers that exist within the social network of users to
obtain an efficient and informative latent representation of the news article.
By modeling the echo-chambers as closely-connected communities within the
social network, we represent a news article as a 3-mode tensor of the structure
(News, User, Community) and propose a tensor factorization based method to
encode the news article in a latent embedding space preserving the community
structure. We also propose an extension of the above method, which jointly
models the community and content information of the news article through a
coupled matrix-tensor factorization framework. We empirically demonstrate the
efficacy of our method for the task of Fake News Detection over two real-world
datasets. Further, we validate the generalization of the resulting embeddings
over two other auxiliary tasks, namely 1) News Cohort Analysis and
2) Collaborative News Recommendation. Our proposed method outperforms
appropriate baselines for both tasks, establishing its generalization.
Comment: Presented at ASONAM'1
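As a rough illustration of the tensor representation described in this abstract, the sketch below builds a small news-by-user-by-community tensor and derives one latent vector per news article. It is a minimal sketch under stated assumptions, not the paper's method: the toy share records, the community assignments, and the function names `build_tensor` and `news_embeddings` are all hypothetical, and a truncated SVD of the mode-1 unfolding stands in for the tensor factorization the authors propose.

```python
import numpy as np

def build_tensor(shares, user_community, n_news, n_users, n_comms):
    """Build a binary 3-mode tensor indexed by (news, user, community).

    shares: iterable of (news_id, user_id) pairs (hypothetical toy input).
    user_community: dict mapping user_id -> community_id (assumed given).
    """
    tensor = np.zeros((n_news, n_users, n_comms))
    for news_id, user_id in shares:
        tensor[news_id, user_id, user_community[user_id]] = 1.0
    return tensor

def news_embeddings(tensor, rank):
    """Embed each news article via truncated SVD of the mode-1 unfolding.

    This is a simple stand-in for a tensor factorization, chosen because it
    needs only NumPy; it is not the factorization used in the paper.
    """
    # Mode-1 unfolding: one row per news article, columns = user x community.
    unfolded = tensor.reshape(tensor.shape[0], -1)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank] * s[:rank]

# Toy data: users 0 and 1 form community 0, users 2 and 3 form community 1.
shares = [(0, 0), (0, 1), (1, 2), (1, 3), (2, 0), (2, 1)]
user_community = {0: 0, 1: 0, 2: 1, 3: 1}

T = build_tensor(shares, user_community, n_news=3, n_users=4, n_comms=2)
E = news_embeddings(T, rank=2)
```

In this toy setup, articles 0 and 2 are shared by the same users in the same community, so they receive the same embedding, while article 1 (shared in the other community) does not.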
Comparative study of antibacterial activity of two different earthworm species, Perionyx excavatus and Pheretima posthuma against pathogenic bacteria
Disease outbreaks are increasingly recognized as a significant constraint on aquaculture production and trade, affecting the economic development of the sector in many countries. Extracting and using biologically active compounds from earthworms has traditionally been practiced by indigenous people throughout the world. The aim of the present study was to demonstrate the antimicrobial activity of earthworm extracts against fish bacterial pathogens. In total, 8 bacterial strains were identified: 6 gram-negative (Aeromonas hydrophila, Pseudomonas aeruginosa, P. fluorescens, E. coli, Enterobacter aerogenes and Shigella sp.) and 2 gram-positive (Staphylococcus aureus and Micrococcus luteus). Extracts of the earthworms Perionyx excavatus and Pheretima posthuma were prepared, and their antimicrobial activity was determined by a well diffusion assay. After 24 hrs of incubation, the earthworm extracts showed antibacterial activity against the isolated bacterial strains. For both species, the maximum zone of inhibition was observed against A. hydrophila: 18.33 ± 0.66 mm for Perionyx excavatus and 16.66 ± 0.33 mm for Pheretima posthuma. Perionyx excavatus showed antibacterial activity against all pathogenic bacteria except Shigella spp. In contrast, Pheretima posthuma showed antibacterial activity against A. hydrophila, P. fluorescens, E. coli, and S. aureus. The study has shown that earthworm extract can be used effectively to suppress bacterial infection in fish and that it can be used as a potential antimicrobial drug against bacteria resistant to commercial antibiotics.
Semi-Supervised Recurrent Neural Network for Adverse Drug Reaction Mention Extraction
Social media is a useful platform to share health-related information due to
its vast reach. This makes it a good candidate for public-health monitoring
tasks, specifically for pharmacovigilance. We study the problem of extraction
of Adverse-Drug-Reaction (ADR) mentions from social media, particularly from
Twitter. Medical information extraction from social media is challenging,
mainly due to the short and highly informal nature of the text, as compared to more
technical and formal medical reports.
Current methods in ADR mention extraction rely on supervised learning
methods, which suffer from the scarcity of labeled data. The state-of-the-art
method uses deep neural networks, specifically a class of Recurrent Neural
Networks (RNNs) known as Long Short-Term Memory networks (LSTMs)
\cite{hochreiter1997long}. Deep neural networks, due to their large number of
free parameters, rely heavily on large annotated corpora for learning the end
task. But in the real world, it is hard to get large labeled datasets, mainly due to the
heavy cost associated with manual annotation. Towards this end, we propose a
novel semi-supervised learning based RNN model, which can also leverage the
unlabeled data present in abundance on social media. Through experiments we
demonstrate the effectiveness of our method, achieving state-of-the-art
performance in ADR mention extraction.
Comment: Accepted at DTMBIO workshop, CIKM 2017. To appear in BMC
Bioinformatics. Pls cite that versio
LANGUAGE MODELS FOR RARE DISEASE INFORMATION EXTRACTION: EMPIRICAL INSIGHTS AND MODEL COMPARISONS
End-to-end relation extraction (E2ERE) is a crucial task in natural language processing (NLP) that involves identifying and classifying semantic relationships between entities in text. This thesis compares three paradigms for E2ERE in biomedicine, focusing on rare diseases with discontinuous and nested entities. We evaluate Named Entity Recognition (NER) to Relation Extraction (RE) pipelines, sequence-to-sequence models, and generative pre-trained transformer (GPT) models using the RareDis information extraction dataset. Our findings indicate that pipeline models are the most effective, followed closely by sequence-to-sequence models. GPT models, despite having eight times as many parameters, perform worse than sequence-to-sequence models and lag significantly behind pipeline models. Our results also hold for a second E2ERE dataset for chemical-protein interactions.
Top K Relevant Passage Retrieval for Biomedical Question Answering
Question answering is a task that answers factoid questions using a large
collection of documents. It aims to provide precise answers in response to the
user's questions in natural language. Question answering relies on efficient
passage retrieval to select candidate contexts, where traditional sparse vector
space models, such as TF-IDF or BM25, are the de facto method. On the web,
no single article can provide all the possible answers to a question asked
by the user. The existing Dense Passage Retrieval (DPR) model has been
trained on a Wikipedia dump from Dec. 20, 2018, as the source documents
for answering questions. Question
answering (QA) has made big strides with several open-domain and machine
comprehension systems built using large-scale annotated datasets. However, in
the clinical domain, this problem remains relatively unexplored. According to
multiple surveys, biomedical questions cannot be answered correctly from
Wikipedia articles. In this work, we build on the existing DPR framework for the
biomedical domain and retrieve answers from PubMed articles, which are a
reliable source for answering medical questions. When evaluated on a BioASQ QA
dataset, our fine-tuned dense retriever achieves an F1 score of 0.81.
Comment: 6 pages, 5 figures. arXiv admin note: text overlap with
arXiv:2004.04906 by other author
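The top-K retrieval step named in this abstract can be sketched as ranking passages by inner product with the query embedding. This is a minimal sketch under stated assumptions: the vectors below are toy values rather than outputs of trained DPR encoders, and the function name `top_k_passages` is hypothetical.

```python
import numpy as np

def top_k_passages(query_vec, passage_vecs, k):
    """Return the k passages with the highest inner-product score.

    query_vec: 1-D array, the query embedding (toy values here).
    passage_vecs: 2-D array, one row per passage embedding.
    Returns a list of (passage_index, score) pairs, best first.
    """
    scores = passage_vecs @ query_vec
    k = min(k, len(scores))
    # Sort by descending score; a stable sort keeps ties in index order.
    top = np.argsort(-scores, kind="stable")[:k]
    return [(int(i), float(scores[i])) for i in top]

# Toy embeddings standing in for encoder outputs (assumption).
passage_vecs = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
query_vec = np.array([1.0, 0.2])

results = top_k_passages(query_vec, passage_vecs, k=2)
```

For larger collections, an exhaustive matrix product like this is usually replaced by an approximate nearest-neighbor index; the exhaustive version is used here only to keep the sketch self-contained.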
