222 research outputs found
Challenges and solutions for Latin named entity recognition
Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English. Data sparsity in Latin presents a number of challenges for traditional Named Entity
Recognition techniques. Solving such challenges and enabling reliable Named Entity Recognition in Latin texts can facilitate many down-stream applications, from machine translation to digital historiography, enabling Classicists, historians, and archaeologists for instance, to track
the relationships of historical persons, places, and groups on a large scale. This paper presents the first annotated corpus for evaluating Named Entity Recognition in Latin, as well as a fully supervised model that achieves over 90% F-score on a held-out test set, significantly outperforming a competitive baseline. We also present a novel active learning strategy that predicts how many and which sentences need to be annotated for named entities in order to attain a specified degree
of accuracy when recognizing named entities automatically in a given text. This maximizes the productivity of annotators while simultaneously controlling quality
Deep Memory Networks for Attitude Identification
We consider the task of identifying attitudes towards a given set of entities
from text. Conventionally, this task is decomposed into two separate subtasks:
target detection that identifies whether each entity is mentioned in the text,
either explicitly or implicitly, and polarity classification that classifies
the exact sentiment towards an identified entity (the target) into positive,
negative, or neutral.
Instead, we show that attitude identification can be solved with an
end-to-end machine learning architecture, in which the two subtasks are
interleaved by a deep memory network. In this way, signals produced in target
detection provide clues for polarity classification, and reversely, the
predicted polarity provides feedback to the identification of targets.
Moreover, the treatments for the set of targets also influence each other --
the learned representations may share the same semantics for some targets but
vary for others. The proposed deep memory network, the AttNet, outperforms
methods that do not consider the interactions between the subtasks or those
among the targets, including conventional machine learning methods and the
state-of-the-art deep learning models.Comment: Accepted to WSDM'1
Learning perceptually grounded word meanings from unaligned parallel data
In order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition which is capable of jointly learning a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy that the robot uses to choose actions in response to a natural language command that factors based on the structure of the language. We use a gradient method to optimize model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users.U.S. Army Research Laboratory (Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016)United States. Office of Naval Research (MURIs N00014-07-1-0749)United States. Army Research Office (MURI N00014-11-1-0688)United States. Defense Advanced Research Projects Agency (DARPA BOLT program under contract HR0011-11-2-0008
Plasma enhanced atomic layer etching of high-k layers on WS2
D. Marinov has received funding from the European Unions Horizon 2020 research and innovation program under Marie Sklodowska-Curie, Grant Agreement No. 752164. J.-F. de Marneffe received funding from the Graphene Flagship, Grant Agreement No. 952792. All authors acknowledge the support of imecs beyond CMOS program, imecs pilot line, and imecs materials and characterization (MCA) group. The authors thank Dr. Laura Nyns (imec) for the ALD of ZrO 2 and HfO 2 films and Dr. Benjamin Groven (imec) for the deposition of WS 2 films by PEALD
CoNLL 2017 Shared Task : Multilingual Parsing from Raw Text to Universal Dependencies
The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, one of two tasks was devoted to learning dependency parsers for a large number of languages, in a real world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe data preparation, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.Peer reviewe
Towards Computing Inferences from English News Headlines
Newspapers are a popular form of written discourse, read by many people,
thanks to the novelty of the information provided by the news content in it. A
headline is the most widely read part of any newspaper due to its appearance in
a bigger font and sometimes in colour print. In this paper, we suggest and
implement a method for computing inferences from English news headlines,
excluding the information from the context in which the headlines appear. This
method attempts to generate the possible assumptions a reader formulates in
mind upon reading a fresh headline. The generated inferences could be useful
for assessing the impact of the news headline on readers including children.
The understandability of the current state of social affairs depends greatly on
the assimilation of the headlines. As the inferences that are independent of
the context depend mainly on the syntax of the headline, dependency trees of
headlines are used in this approach, to find the syntactical structure of the
headlines and to compute inferences out of them.Comment: PACLING 2019 Long paper, 15 page
A Formal Framework for Modelling Coercion Resistance and Receipt Freeness
Abstract. Coercion resistance and receipt freeness are critical proper-ties for any voting system. However, many different definitions of these properties have been proposed, some formal and some informal; and there has been little attempt to tie these definitions together or identify rela-tions between them. We give here a general framework for specifying different coercion re-sistance and receipt freeness properties using the process algebra CSP. The framework is general enough to accommodate a wide range of defini-tions, and strong enough to cover both randomization attacks and forced abstention attacks. We provide models of some simple voting systems, and show how the framework can be used to analyze these models un-der different definitions of coercion resistance and receipt freeness. Our formalisation highlights the variation between the definitions, and the importance of understanding the relations between them.
Automated Anonymity Verification of the ThreeBallot Voting System
In recent years, a large number of secure voting protocols have been proposed in the literature. Often these protocols contain flaws, but because they are complex protocols, rigorous formal analysis has proven hard to come by. Rivest’s ThreeBallot voting system is important because it aims to provide security (voter anonymity and voter verifiability) without requiring cryptography. In this paper, we construct a CSP model of ThreeBallot, and use it to produce the first automated formal analysis of its anonymity property. Along the way, we discover that one of the crucial assumptions under which ThreeBallot (and many other voting systems) operates-the Short Ballot Assumption-is highly ambiguous in the literature.We give various plausible precise interpretations, and discover that in each case, the interpretation either is unrealistically strong, or else fails to ensure anonymity. Therefore, we give a version of the Short Ballot Assumption for ThreeBallot that is realistic but still provides a guarantee of anonymity
- …
