454 research outputs found

    A Machine learning approach to POS tagging

    We have applied inductive learning of statistical decision trees and relaxation labelling to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (Part Of Speech Tagging). The learning process is supervised and obtains a language model oriented to resolving POS ambiguities. This model consists of a set of statistical decision trees expressing the distribution of tags and words in some relevant contexts. The acquired language models are complete enough to be directly used as sets of POS disambiguation rules, and include more complex contextual information than the simple collections of n-grams usually used in statistical taggers. We have implemented a quite simple and fast tagger that has been tested and evaluated on the Wall Street Journal (WSJ) corpus with remarkable accuracy. However, better results can be obtained by translating the trees into rules to feed a flexible relaxation-labelling-based tagger. In this direction we describe a tagger which is able to use information of any kind (n-grams, automatically acquired constraints, linguistically motivated manually written constraints, etc.), and in particular to incorporate the machine-learned decision trees. Simultaneously, we address the problem of tagging when only a small amount of training material is available, which is crucial in any process of constructing an annotated corpus from scratch. We show that quite high accuracy can be achieved with our system in this situation. Postprint (published version)

    Discourse Structure in Machine Translation Evaluation

    In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments at both the segment level and the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular we show that: (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference tree is positively correlated with translation quality. Comment: machine translation, machine translation evaluation, discourse analysis. Computational Linguistics, 201
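    The "simple linear combination" mentioned in this abstract can be sketched as a weighted sum of an existing metric score and a discourse-similarity score. This is only an illustration: the function name, the example scores, and the weights are hypothetical and are not taken from the paper, which would learn or tune such weights against human judgments.

```python
from typing import Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-segment scores (hypothetical helper, not the paper's code).

    scores  -- e.g. [existing_metric_score, discourse_similarity_score]
    weights -- one weight per score; the values used below are made up
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


# Illustrative values only: an existing metric score of 0.5 and a
# discourse-tree similarity of 0.8, combined with assumed weights.
combined = combine_scores([0.5, 0.8], [0.7, 0.3])
```

    With weights fixed in advance this is just a dot product; the interesting part in practice is choosing the weights so that the combined score correlates better with human judgments than either input alone.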

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing. Postprint (published version)

    Building a systemic view of the human body through the critical reading of a socio-scientific controversy

    This article presents the development of an activity, carried out in the 3rd year of ESO and the 1st year of batxillerat, to work on the systemic view of the human body through the critical reading of an interview with a pharmacist published in a newspaper. The activity aims to promote the development of this scientific knowledge and the ability to analyse a socio-scientific controversy in a well-founded way.

    Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness

    We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 1: Check-Worthiness. The task asks to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. We offered the task in both English and Arabic, based on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign. A total of 30 teams registered to participate in the Lab and seven teams actually submitted systems for Task 1. The most successful approaches used by the participants relied on recurrent and multi-layer neural networks, as well as on combinations of distributional representations, on matching claims' vocabulary against lexicons, and on measures of syntactic dependency. The best systems achieved mean average precision of 0.18 and 0.15 on the English and on the Arabic test datasets, respectively. This leaves large room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in check-worthiness estimation. Comment: Computational journalism, Check-worthiness, Fact-checking, Veracit
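    For readers unfamiliar with the measure reported above, mean average precision (MAP) over ranked sentence lists can be sketched as follows. This is a minimal illustration and not the Lab's released scoring script; the function names are hypothetical, and it assumes every sentence of a debate appears in the ranking, so all check-worthy sentences are eventually retrieved.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one ranked list.

    ranked_labels holds 1 (check-worthy) or 0 (not), ordered from the
    system's top-ranked sentence to its lowest-ranked one.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            # Precision at this rank, counted only where a relevant item occurs.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists: Sequence[Sequence[int]]) -> float:
    """Mean of the per-list average precision (one list per debate or speech)."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Made-up toy rankings for two debates, not data from the Lab.
toy_map = mean_average_precision([[1, 0, 1, 0], [0, 1]])
```

    Because each check-worthy sentence contributes the precision at its own rank, placing the few relevant sentences near the top of a long debate transcript matters far more than the ordering of the remainder.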

    Full automation of a home

    This project discusses and establishes the feasibility of fully automating a home using currently available home automation technologies. The initial idea is to replace all the elements of the building (lighting, climate control, moving parts, ...) with home automation devices and to deploy visualization software on mobile devices (smartphones, tablets) that gives full control of the installation. We evaluate which commercially available solution best suits the project and implement it, then integrate it with the visualization and control systems.