454 research outputs found
A Machine learning approach to POS tagging
We have applied inductive learning of statistical decision trees
and relaxation labelling to the Natural Language Processing (NLP)
task of morphosyntactic disambiguation (Part Of Speech Tagging).
The learning process is supervised and yields a language
model oriented towards resolving POS ambiguities. This model consists
of a set of statistical decision trees expressing the distribution of
tags and words in relevant contexts.
The acquired language models are complete enough to be directly
used as sets of POS disambiguation rules, and include more complex
contextual information than simple collections of n-grams usually
used in statistical taggers.
We have implemented a simple and fast tagger that has been
tested and evaluated on the Wall Street Journal (WSJ) corpus with
remarkable accuracy.
However, better results can be obtained by translating the trees
into rules to feed a flexible tagger based on relaxation labelling.
To this end, we describe a tagger which is able to use
information of any kind (n-grams, automatically acquired constraints,
linguistically motivated manually written constraints, etc.), and in
particular to incorporate the machine learned decision trees.
We also address the problem of tagging when only a small amount
of training material is available, which is crucial in any process
of constructing an annotated corpus from scratch. We show that quite
high accuracy can be achieved with our system in this situation.
Postprint (published version)
Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with the Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics regarding correlation with
human judgments both at the segment- and at the system-level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTKparty. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality.
Comment: machine translation, machine translation evaluation, discourse
analysis. Computational Linguistics, 201
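As an illustration of the kind of tree-kernel comparison mentioned in the abstract above, the following minimal sketch counts the tree fragments shared by two small trees, in the style of a Collins-Duffy convolution kernel. The tuple encoding of trees, the labels (ELAB, NUC, SAT), the decay parameter `lam`, and all function names are assumptions made for this example; the article's actual kernel, tree representation, and features are not specified here.

```python
from math import sqrt

# Assumed encoding: a node is a tuple (label, child, child, ...);
# a leaf is a plain string. This is a hypothetical format for illustration.

def production(node):
    """Return the node label plus the typed labels of its direct children."""
    kids = tuple(("N", c[0]) if isinstance(c, tuple) else ("L", c)
                 for c in node[1:])
    return (node[0], kids)

def nodes(tree):
    """Yield every internal node of the tree (leaves are skipped)."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from nodes(child)

def common(n1, n2, lam):
    """Weighted count of fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # equal productions imply c2 is a tuple too
            score *= 1.0 + common(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(common(a, b, lam) for a in nodes(t1) for b in nodes(t2))

def tree_similarity(t1, t2, lam=1.0):
    """Kernel value normalized by the self-similarity of each tree."""
    return tree_kernel(t1, t2, lam) / sqrt(
        tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))

# Toy trees: same structure, one differing leaf.
reference = ("ELAB", ("NUC", "a"), ("SAT", "b"))
translation = ("ELAB", ("NUC", "a"), ("SAT", "c"))
print(tree_similarity(reference, translation))
```

A normalized score of this kind could then be combined linearly with other metric scores, which is the general idea the abstract describes.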
Introduction to the special issue on cross-language algorithms and applications
With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.
Postprint (published version)
Building a systemic view of the human body through the critical reading of a socio-scientific controversy
This article presents the development of an activity, carried out in the 3rd year of ESO and the 1st year of batxillerat, that works on the systemic view of the human body through the critical reading of an interview with a pharmacist published in a newspaper. The activity aims to promote the development of this scientific knowledge and the ability to analyse a socio-scientific controversy in a well-founded way.
The archaeological intervention at Carrer de la Font 7-9. A dump of 14th-century materials in the area of the Call (Jewish quarter) of Tàrrega
Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness
We present an overview of the CLEF-2018 CheckThat! Lab on Automatic
Identification and Verification of Political Claims, with focus on Task 1:
Check-Worthiness. The task asks participants to predict which claims in a political debate
should be prioritized for fact-checking. In particular, given a debate or a
political speech, the goal was to produce a ranked list of its sentences based
on their worthiness for fact checking. We offered the task in both English and
Arabic, based on debates from the 2016 US Presidential Campaign, as well as on
some speeches during and after the campaign. A total of 30 teams registered to
participate in the Lab and seven teams actually submitted systems for Task 1.
The most successful approaches used by the participants relied on recurrent and
multi-layer neural networks, as well as on combinations of distributional
representations, on matching the claims' vocabulary against lexicons, and on
measures of syntactic dependency. The best systems achieved mean average
precision of 0.18 and 0.15 on the English and on the Arabic test datasets,
respectively. This leaves large room for further improvement, and thus we
release all datasets and the scoring scripts, which should enable further
research in check-worthiness estimation.
Comment: Computational journalism, Check-worthiness, Fact-checking, Veracit
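For readers unfamiliar with the measure reported above, the following minimal sketch shows one common way to compute mean average precision over ranked lists of sentences. The function names and the toy relevance labels are assumptions made for this example; the lab's released scoring scripts may differ in details such as tie handling. The sketch assumes each ranking contains every sentence of a debate, so all check-worthy sentences appear somewhere in the list.

```python
def average_precision(ranked_labels):
    """Average precision for one ranked list.

    ranked_labels holds 1 for a check-worthy sentence and 0 otherwise,
    ordered from the highest-ranked sentence to the lowest.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-debate average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Toy example with two hypothetical debates.
debates = [
    [1, 0, 1, 0],  # check-worthy sentences ranked 1st and 3rd
    [0, 1],        # check-worthy sentence ranked 2nd
]
print(mean_average_precision(debates))
```

Under this definition a score of 1.0 means every check-worthy sentence is ranked above every other sentence.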
Comprehensive automation of a dwelling
This project discusses and assesses the feasibility of fully automating a dwelling using currently available home automation technologies. The initial idea is to replace all the elements that make up the building (lighting, climate control, moving parts, ...) with home automation devices and to deploy visualization software on mobile devices (smartphones, tablets) that allows full control of the installation. We evaluate which market solution best suits the project and implement it, subsequently integrating it with the visualization and control systems.
