Resolving ambiguities in the extraction of syntactic categories through chunking.
In recent years, several authors have investigated how co-occurrence statistics in natural language can act as a cue that children may use to extract syntactic categories for the language they are learning. While some authors have reported encouraging results, it is difficult to evaluate the quality of the syntactic categories derived. It is argued in this paper that traditional measures of accuracy are inherently flawed. A valid evaluation metric needs to consider the well-formedness of the utterances that are generated at the production end. This paper attempts to evaluate the quality of the categories derived from co-occurrence statistics through the use of MOSAIC, a computational model of syntax acquisition that has already been used to simulate several phenomena in child language. It will be shown that derived syntactic categories which may appear to be of high quality will quickly give rise to errors which are not typical of child speech. A solution to this problem is suggested in the form of a chunking mechanism which serves to differentiate between alternative grammatical functions of identical word forms. Results are evaluated in terms of the error rates in utterances produced by the system as well as the quantitative fit to the phenomenon of subject omission.
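As an illustration of the kind of co-occurrence analysis evaluated here, the sketch below groups words by the distribution of their immediate neighbours in a toy corpus. It is a minimal, hypothetical example and not the procedure used in MOSAIC; the corpus, function names and the shared-context scoring are assumptions made for the illustration.

# Minimal sketch: derive candidate syntactic categories by comparing words
# on the distribution of their immediate left/right neighbours.
# Hypothetical example, not the MOSAIC procedure.
from collections import Counter, defaultdict
from itertools import combinations

corpus = [
    "the dog chased the cat",
    "the cat saw a dog",
    "a dog can run",
    "the cat can sleep",
]

# Count left and right neighbours for every word.
contexts = defaultdict(Counter)
for utterance in corpus:
    words = utterance.split()
    for i, w in enumerate(words):
        left = words[i - 1] if i > 0 else "<s>"
        right = words[i + 1] if i < len(words) - 1 else "</s>"
        contexts[w][("L", left)] += 1
        contexts[w][("R", right)] += 1

def overlap(w1, w2):
    """Shared-context score: co-occurrence contexts the two words have in common."""
    c1, c2 = contexts[w1], contexts[w2]
    return sum(min(c1[k], c2[k]) for k in c1.keys() & c2.keys())

# Words with high context overlap are candidates for the same syntactic category
# (e.g. "dog" and "cat", which both follow "the"/"a").
pairs = sorted(combinations(contexts, 2), key=lambda p: -overlap(*p))
for w1, w2 in pairs[:5]:
    print(w1, w2, overlap(w1, w2))

Note how identical word forms with different grammatical functions would receive a single, merged context profile under this scheme; that is the ambiguity the paper's chunking mechanism is intended to resolve.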
Three fermions with six single particle states can be entangled in two inequivalent ways
Using a generalization of Cayley's hyperdeterminant as a new measure of tripartite fermionic entanglement, we obtain the SLOCC classification of three-fermion systems with six single particle states. A special subclass of such three-fermion systems is shown to have the same properties as the well-known three-qubit ones. Our results can be presented in a unified way using Freudenthal triple systems based on cubic Jordan algebras. For systems with an arbitrary number of fermions and single particle states, we propose the Plücker relations as a necessary and sufficient condition of separability.
Meter-based omission of function words in MOSAIC
MOSAIC (Model of Syntax Acquisition in Children) is augmented with a new mechanism that allows for the omission of unstressed function words based on the prosodic structure of the utterance in which they occur. The mechanism allows MOSAIC to omit elements from multiple locations in a target utterance, which it was previously unable to do. It is shown that, although the new mechanism results in Optional Infinitive errors when run on children's input, it is insufficient to simulate the high rate of OI errors in children's speech unless combined with MOSAIC's edge-first learning mechanism. It is also shown that the addition of the new mechanism does not adversely affect MOSAIC's fit to the Optional Infinitive phenomenon. The mechanism does, however, make MOSAIC's output more child-like, both in terms of the range of utterances it can simulate and the level and type of determiner omission that the model displays.
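A purely illustrative sketch of metrically conditioned omission is given below (not the mechanism as implemented in MOSAIC): content words are treated as stressed, function words as unstressed, and an unstressed word survives only if it directly follows a stressed word, i.e. can attach to a strong-weak foot. The word lists and footing rule are assumptions made for the example.

# Illustrative sketch of metrically conditioned function-word omission.
# Assumption: content words are stressed, function words are unstressed,
# and an unstressed word is retained only if it immediately follows a
# stressed word. Not MOSAIC's actual mechanism.
FUNCTION_WORDS = {"the", "a", "is", "to", "he", "she", "it"}

def omit_unfooted_function_words(utterance):
    words = utterance.split()
    kept = []
    prev_stressed = False
    for w in words:
        stressed = w not in FUNCTION_WORDS
        # Keep stressed words always; keep an unstressed word only if it can
        # attach to the stressed word immediately before it.
        if stressed or prev_stressed:
            kept.append(w)
        prev_stressed = stressed
    return " ".join(kept)

print(omit_unfooted_function_words("he is going to the shop"))
# -> "going to shop" under these toy assumptions: "he" and "is" precede any
#    stressed word, and "the" follows unstressed "to", so all three are dropped.

Note that words are dropped from multiple, non-adjacent positions in the utterance, which is the capability the new mechanism adds to MOSAIC.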
Simulating the temporal reference of Dutch and English Root Infinitives.
Hoekstra & Hyams (1998) claim that the overwhelming majority of Dutch children’s Root Infinitives (RIs) are used to refer to modal (not realised) events, whereas in English speaking children, the temporal reference of RIs is free. Hoekstra & Hyams attribute this difference to qualitative differences in how temporal reference is carried by the Dutch infinitive and the English bare form. Ingram & Thompson (1996) advocate an input-driven account of this difference and suggest that the modal reading of German (and Dutch) RIs is caused by the fact that infinitive forms are predominantly used in modal contexts. This paper investigates whether an input-driven account can explain the differential reading of RIs in Dutch and English. To this end, corpora of English and Dutch Child Directed Speech were fed through MOSAIC, a computational model that has already been used to simulate the basic Optional Infinitive phenomenon. Infinitive forms in the input were tagged for modal or non-modal reference based on the sentential context in which they appeared. The output of the model was compared to the results of corpus studies and recent experimental data which call into question the strict distinction between Dutch and English advocated by Hoekstra & Hyams
Modelling the development of Dutch Optional Infinitives in MOSAIC.
This paper describes a computational model which simulates the change in the use of optional infinitives that is evident in children learning Dutch as their first language. The model, developed within the framework of MOSAIC, takes naturalistic, child-directed speech as its input and analyses the distributional regularities present in the input. It slowly learns to generate longer utterances as it sees more input. We show that the developmental characteristics of Dutch children's speech (with respect to optional infinitives) are a natural consequence of MOSAIC's learning mechanisms and the gradual increase in the length of the utterances it produces. In contrast with nativist approaches to syntax acquisition, the present model does not assume large amounts of innate knowledge in the child, and provides a quantitative process account of the development of optional infinitives.
Comparing MOSAIC and the variational learning model of the optional infinitive stage in early child language
This paper compares MOSAIC and the Variational Learning Model (VLM) in terms of their ability to explain the level of finiteness marking in early child Dutch, English, Spanish, German and French. It is shown that both models are successful in explaining cross-linguistic variation in rates of Optional Infinitive (OI) errors, although both models underestimate the error rate in English. A second set of analyses shows strong lexical effects in the pattern of errors across all five languages studied. This finding is problematic for the Variational Learning Model and provides strong support for the notion that OI errors are incomplete compound finites, as instantiated in MOSAIC.
Modelling syntactic development in a cross-linguistic context
Mainstream linguistic theory has traditionally assumed that children come into the world with rich innate knowledge about language and grammar. More recently, computational work using distributional algorithms has shown that the information contained in the input is much richer than proposed by the nativist approach. However, neither of these approaches has been developed to the point of providing detailed and quantitative predictions about the developmental data. In this paper, we champion a third approach, in which computational models learn from naturalistic input and produce utterances that can be directly compared with the utterances of language-learning children. We demonstrate the feasibility of this approach by showing how MOSAIC, a simple distributional analyser, simulates the optional-infinitive phenomenon in English, Dutch, and Spanish. The model accounts for young children's tendency to use both correct finites and incorrect (optional) infinitives in finite contexts, for the generality of this phenomenon across languages, and for the sparseness of other types of errors (e.g., word order errors). It thus shows how these phenomena, which have traditionally been taken as evidence for innate knowledge of Universal Grammar, can be explained in terms of a simple distributional analysis of the language to which children are exposed.
Towards a Unified Model of Language Acquisition
In this theoretical paper, we first review and rebut standard criticisms against distributional approaches to language acquisition. We then present two closely related models that use distributional analysis. The first deals with the acquisition of vocabulary, the second with grammatical development. We show how these two models can be combined with a semantic network grown using Hebbian learning, and briefly illustrate the advantages of this combination. An important feature of this hybrid system is that it combines two different types of distributional learning, the first based on order and the second based on co-occurrences within a context.
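To make the Hebbian component concrete, the following is a minimal sketch of how a semantic network might be grown by strengthening links between words that co-occur in the same utterance. The update rule shown is plain Hebbian learning; the representation, parameter values and names are illustrative assumptions rather than details of the authors' model.

# Minimal Hebbian sketch: links between words that co-occur in an utterance
# are strengthened in proportion to their joint activation.
# Illustrative only; parameter values and representation are assumptions.
from collections import defaultdict
from itertools import combinations

LEARNING_RATE = 0.1
weights = defaultdict(float)  # symmetric association strengths between word pairs

def hebbian_update(utterance):
    words = set(utterance.split())
    for w1, w2 in combinations(sorted(words), 2):
        # Hebbian rule: co-activation increases connection strength.
        weights[(w1, w2)] += LEARNING_RATE

for utt in ["the dog chased the cat", "the dog barked", "the cat slept"]:
    hebbian_update(utt)

# Strongest associations emerge for words that frequently co-occur.
print(sorted(weights.items(), key=lambda kv: -kv[1])[:3])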
On the Utility of Conjoint and Compositional Frames and Utterance Boundaries
This paper reports the results of a series of connectionist simulations aimed at establishing the value of different types of contexts as predictors of the grammatical categories of words. A comparison is made between ‘compositional’ frames (Monaghan & Christiansen, 2004) and non-compositional or ‘conjoint’ frames (Mintz, 2003). Attention is given to the role of utterance boundaries both as a category to be predicted and as a predictor. The role of developmental constraints is investigated by examining the effect of restricting the analysis to utterance-final frames. In line with the results reported by Monaghan and Christiansen, compositional frames are better predictors than conjoint frames, though the latter provide a small performance improvement when combined with compositional frames. Utterance boundaries are shown to be detrimental to performance when included as an item to be predicted, while improving performance when included as a predictor. The utility of utterance boundaries is further supported by the finding that, when the analysis is restricted to utterance-final frames (which are likely to be a particularly important source of information early in development), frames including utterance boundaries are far better predictors than lexical frames.
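The difference between the two frame types can be illustrated with a small sketch: a conjoint frame treats the preceding and following word as a single joint unit (a__b), whereas compositional frames treat them as two independent features (a__ and __b). The code below extracts both from a toy corpus; it only illustrates the two representations and is not the connectionist setup used in the simulations.

# Sketch of the two frame representations compared in the paper.
# Conjoint frame (Mintz, 2003): the pair (preceding word, following word) as one unit.
# Compositional frames (Monaghan & Christiansen, 2004): preceding and following word
# treated as two separate features. Illustrative only.
from collections import defaultdict

corpus = ["you want to eat it", "you have to go now", "do you want it"]

conjoint = defaultdict(set)       # (left, right) -> words seen in that slot
compositional = defaultdict(set)  # ('L', left) or ('R', right) -> words

for utterance in corpus:
    words = ["<s>"] + utterance.split() + ["</s>"]  # utterance boundaries as tokens
    for i in range(1, len(words) - 1):
        left, target, right = words[i - 1], words[i], words[i + 1]
        conjoint[(left, right)].add(target)
        compositional[("L", left)].add(target)
        compositional[("R", right)].add(target)

# Words that fill the same frame are candidates for the same category, e.g. the
# conjoint frame ('want', 'eat') vs the single compositional feature ('R', 'to').
print(conjoint[("want", "eat")])      # {'to'}
print(compositional[("R", "to")])     # {'want', 'have'}

Treating the boundary markers <s> and </s> as ordinary tokens is what allows utterance boundaries to serve as predictors in frames such as ('<s>', 'you').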
Simulating optional infinitive errors in child speech through the omission of sentence-internal elements.
A new version of the MOSAIC model of syntax acquisition is presented. The modifications to the model aim to address two weaknesses in its earlier simulations of the Optional Infinitive phenomenon: an over-reliance on questions in the input as the source for Optional Infinitive errors, and the use of an utterance-final bias in learning (recency effect) without a corresponding utterance-initial bias (primacy effect). Where the old version only produced utterance-final phrases, the new version of MOSAIC learns from both the left and right edge of the utterance, and associates utterance-initial and utterance-final phrases. The new model produces both utterance-final phrases and concatenations of utterance-final and utterance-initial phrases. MOSAIC now also differentiates between phrases learned from declarative and interrogative input. It will be shown that the new version is capable of simulating the Optional Infinitive phenomenon in English and Dutch without relying on interrogative input. Unlike the previous version of MOSAIC, the new version is also capable of simulating cross-linguistic variation in the occurrence of Optional Infinitive errors in Wh-questions.
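A minimal sketch of the edge-based learning described above: phrases are collected from the left and right edges of each input utterance, and generated utterances are either utterance-final phrases alone or an utterance-initial phrase followed by an utterance-final phrase. This is a simplification for illustration only; the gradual growth, phrase associations and declarative/interrogative distinction of the actual model are omitted, and the ordering of the concatenation is an assumption made for the example.

# Simplified sketch of edge-based learning and generation.
# Real MOSAIC learns gradually and links specific initial/final phrases;
# this illustration just collects edge phrases and concatenates them.
initial_phrases = set()
final_phrases = set()

MAX_PHRASE_LEN = 2  # assumed cap, standing in for MOSAIC's gradual growth

def learn(utterance):
    words = utterance.split()
    for n in range(1, min(MAX_PHRASE_LEN, len(words)) + 1):
        initial_phrases.add(tuple(words[:n]))   # left-edge (utterance-initial) phrase
        final_phrases.add(tuple(words[-n:]))    # right-edge (utterance-final) phrase

def generate():
    # Produce utterance-final phrases alone, plus initial+final concatenations.
    output = {" ".join(f) for f in final_phrases}
    for ini in initial_phrases:
        for fin in final_phrases:
            output.add(" ".join(ini + fin))
    return output

for utt in ["he wants to go home", "she goes home"]:
    learn(utt)

# Concatenations such as "he go home" show how a non-finite (Optional Infinitive)
# form can surface when an initial phrase is combined with a final phrase.
print(sorted(generate())[:10])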
