Semantic construction in feature-based TAG
We propose a semantic construction method for Feature-Based Tree Adjoining Grammar which is based on the derived tree, compare it with related proposals, and briefly discuss some implementation possibilities.
Sloppy Identity
Although sloppy interpretation is usually accounted for by theories of
ellipsis, it often arises in non-elliptical contexts. In this paper, a theory
of sloppy interpretation is provided which captures this fact. The underlying
idea is that sloppy interpretation results from a semantic constraint on
parallel structures, and the theory is shown to predict sloppy readings for
deaccented and paycheck sentences as well as relational-, event-, and
one-anaphora. It is further shown to capture the interaction of sloppy/strict
ambiguity with quantification and binding.
A specification language for Lexical Functional Grammars
This paper defines a language L for specifying LFG grammars. This enables
constraints on LFG's composite ontology (c-structures synchronised with
f-structures) to be stated directly; no appeal to the LFG construction
algorithm is needed. We use L to specify schemata-annotated rules and the LFG
uniqueness, completeness and coherence principles. Broader issues raised by
this work are noted and discussed.
Grouping Synonyms by Definitions
We present a method for grouping the synonyms of a lemma according to its
dictionary senses. The senses are defined by a large machine readable
dictionary for French, the TLFi (Trésor de la langue française
informatisé), and the synonyms are given by 5 synonym dictionaries (also for
French). To evaluate the proposed method, we manually constructed a gold
standard where for each (word, definition) pair and given the set of synonyms
defined for that word by the 5 synonym dictionaries, 4 lexicographers specified
the set of synonyms they judge adequate. While inter-annotator agreement ranges
on that task from 67% to at best 88% depending on the annotator pair and on the
synonym dictionary being considered, the automatic procedure we propose scores
a precision of 67% and a recall of 71%. The proposed method is compared with
related work, namely word sense disambiguation, synonym lexicon acquisition, and
WordNet construction.
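As a minimal sketch (hypothetical data structures and function names, not the authors' implementation), the set-based precision and recall reported above could be computed against such a gold standard along these lines:

```python
# Hypothetical sketch: set-based precision/recall of proposed synonym
# groupings against a lexicographer-built gold standard.
# Keys are (word, definition) pairs; values are sets of synonyms.

def precision_recall(proposed: dict, gold: dict) -> tuple[float, float]:
    """Micro-averaged precision and recall over all (word, definition) pairs."""
    tp = proposed_total = gold_total = 0
    for key, gold_syns in gold.items():
        pred_syns = proposed.get(key, set())
        tp += len(pred_syns & gold_syns)   # synonyms both proposed and judged adequate
        proposed_total += len(pred_syns)   # all synonyms the procedure proposed
        gold_total += len(gold_syns)       # all synonyms judged adequate
    precision = tp / proposed_total if proposed_total else 0.0
    recall = tp / gold_total if gold_total else 0.0
    return precision, recall

# Toy example (invented data, for illustration only):
gold = {("voiture", "véhicule automobile"): {"automobile", "auto", "bagnole"}}
proposed = {("voiture", "véhicule automobile"): {"automobile", "auto", "wagon"}}
print(precision_recall(proposed, gold))  # (0.666..., 0.666...)
```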
French Order Without Order
To account for the semi-free word order of French, Unification Categorial Grammar is extended in two ways. First, verbal valencies are contained in a set rather than in a list. Second, type-raised NPs are described as two-sided functors. The new framework does not overgenerate, i.e. it accepts all and only the sentences which are grammatical. This follows partly from the elimination of false lexical ambiguities (i.e. ambiguities introduced in order to account for all the possible positions a word can occupy within a sentence) and partly from a system of features constraining the possible combinations.
Position statement: Inference in Question Answering
One can only exploit inference in Question Answering (QA) and assess its contribution systematically if one knows what inference is contributing to. Thus we identify a set of tasks specific to QA and discuss what inference could contribute to their achievement. We conclude with a proposal for graduated test suites as a tool for assessing the performance and impact of inference.
Analysing Data-To-Text Generation Benchmarks
Recently, several data-sets associating data to text have been created to
train data-to-text surface realisers. It is unclear however to what extent the
surface realisation task exercised by these data-sets is linguistically
challenging. Do these data-sets provide enough variety to encourage the
development of generic, high-quality data-to-text surface realisers? In this
paper, we argue that these data-sets have important drawbacks. We back up our
claim using statistics, metrics and manual evaluation. We conclude by eliciting
a set of criteria for the creation of a data-to-text benchmark which could help
better support the development, evaluation and comparison of linguistically
sophisticated data-to-text surface realisers.
Creating Training Corpora for NLG Micro-Planning
In this paper, we focus on how to create data-to-text corpora which can support the learning of wide-coverage micro-planners, i.e., generation systems that handle lexicalisation, aggregation, surface realisation, sentence segmentation and referring expression generation. We start by reviewing common practice in designing training benchmarks for Natural Language Generation. We then present a novel framework for semi-automatically creating linguistically challenging NLG corpora from existing Knowledge Bases. We apply our framework to DBpedia data and compare the resulting dataset with that of Wen et al. (2016). We show that while the Wen et al. (2016) dataset is more than twice as large as ours, it is less diverse both in terms of input and in terms of text. We thus propose our corpus generation framework as a novel method for creating challenging data sets from which NLG models capable of generating text from KB data can be learned.
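As a rough illustration of the kind of input- and text-side diversity comparison described above (hypothetical data layout and metrics, not the paper's own measures), two such data-to-text corpora could be contrasted along these lines:

```python
# Hypothetical sketch: comparing input- and text-side diversity of a
# data-to-text corpus given as a list of (triples, text) pairs, where
# each triple is an (attribute, value) pair.

def diversity_stats(corpus):
    """Distinct input shapes and text type/token ratio. Illustrative metrics only."""
    # Input diversity: how many distinct combinations of attributes occur.
    input_shapes = {frozenset(attr for attr, _ in triples) for triples, _ in corpus}
    # Text diversity: ratio of distinct tokens to total tokens.
    tokens = [tok for _, text in corpus for tok in text.lower().split()]
    ttr = len(set(tokens)) / len(tokens) if tokens else 0.0
    return {"size": len(corpus),
            "distinct_input_shapes": len(input_shapes),
            "type_token_ratio": ttr}

# Toy example (invented data):
corpus_a = [([("birthPlace", "Lyon"), ("occupation", "writer")],
             "Born in Lyon , she worked as a writer .")]
print(diversity_stats(corpus_a))
```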
