180 research outputs found
The Current Status of Historical Preservation Law in Regulatory Takings Jurisprudence: Has the Lucas Missile Dismantled Preservation Programs?
This paper describes our NIHRIO system for SemEval-2018 Task 3, "Irony detection in English tweets". We propose a simple Multilayer Perceptron neural network architecture with several types of input features: lexical, syntactic, semantic and polarity features. Our system achieves very high performance in both the binary and the multi-class irony detection subtasks. In particular, we rank fifth in terms of both the accuracy metric and the F1 metric. Our code is available at: https://github.com/NIHRIO/IronyDetectionInTwitte
Addressing the Burden of Antimicrobial Resistance in Vietnamese Hospitals
Hospital-acquired infections (HAIs), especially ventilator-associated respiratory infection (VARI), cause significant morbidity and mortality, and disproportionately so in low- and middle-income countries (LMICs), including Vietnam, where infection control in hospitals is often neglected. The management of HAIs in these settings is challenging because of the high proportions of antimicrobial drug resistance and limitations in laboratory diagnostics and in financial and human resources, the latter in terms of knowledge and skills for antimicrobial stewardship and infection prevention and control.
A Label Attention Model for ICD Coding from Clinical Text
ICD coding is a process of assigning the International Classification of
Disease diagnosis codes to clinical/medical notes documented by health
professionals (e.g. clinicians). This process requires significant human
resources, and thus is costly and prone to error. To handle the problem,
machine learning has been utilized for automatic ICD coding. Previous
state-of-the-art models were based on convolutional neural networks, using one
or several fixed window sizes. However, the lengths and interdependence
between text fragments related to ICD codes in clinical text vary
significantly, leading to the difficulty of deciding what the best window sizes
are. In this paper, we propose a new label attention model for automatic ICD
coding, which can handle both the various lengths and the interdependence of
the ICD code related text fragments. Furthermore, as the majority of ICD codes
are not frequently used, leading to the extremely imbalanced data issue, we
additionally propose a hierarchical joint learning mechanism extending our
label attention model to handle the issue, using the hierarchical relationships
among the codes. Our label attention model achieves new state-of-the-art
results on three benchmark MIMIC datasets, and the joint learning mechanism
helps improve the performance for infrequent codes.
Comment: In Proceedings of IJCAI 2020 (Main Track
Sentiment classification on polarity reviews: an empirical study using rating-based features
We present a new feature type, the rating-based feature, and evaluate its contribution to the task of document-level sentiment analysis. We achieve state-of-the-art results on two publicly available standard polarity movie datasets: an accuracy of 91.6% on the dataset of 2000 reviews produced by Pang and Lee (2004), and 89.87% on the dataset of 50000 reviews created by Maas et al. (2011). We also reach 93.24% on our own dataset of 233600 movie reviews, and we aim to share this dataset for further research on the sentiment polarity analysis task.
VnCoreNLP: A Vietnamese Natural Language Processing Toolkit
We present an easy-to-use and fast toolkit, VnCoreNLP, a Java NLP
annotation pipeline for Vietnamese. Our VnCoreNLP supports key natural language
processing (NLP) tasks including word segmentation, part-of-speech (POS)
tagging, named entity recognition (NER) and dependency parsing, and obtains
state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to
provide rich linguistic annotations to facilitate research work on Vietnamese
NLP. Our VnCoreNLP is open-source and available at:
https://github.com/vncorenlp/VnCoreNLP
Comment: Proceedings of the 2018 Conference of the North American Chapter of
the Association for Computational Linguistics: Demonstrations, NAACL 2018, to
appear
Search Personalization with Embeddings
Recent research has shown that the performance of search personalization depends on the richness of user profiles, which normally represent the user’s topical interests. In this paper, we propose a new embedding approach to learning user profiles, where users are embedded in a topical interest space. We then directly utilize the user profiles for search personalization. Experiments on query logs from a major commercial web search engine demonstrate that our embedding approach improves the performance of the search engine and also achieves better search performance than other strong baselines.
