Ordering the suggestions of a spellchecker without using context.
Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype.
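As an illustration of the starting point described in this abstract, the following is a minimal sketch of a corrector that reverses single simple errors and then orders the surviving candidates by word frequency. The abstract does not name the algorithm or give its details, so the four error types used here (an inserted letter, an omitted letter, a wrong letter, two transposed letters), the function names and the toy frequency table are all assumptions made for the example, not the author's implementation.

```python
import string


def single_error_candidates(misspelling, dictionary):
    """Return dictionary words reachable by undoing one simple error.

    Assumed error types: one inserted letter, one omitted letter,
    one wrong letter, or two adjacent letters transposed.
    """
    letters = string.ascii_lowercase
    edits = set()
    for i in range(len(misspelling) + 1):
        left, right = misspelling[:i], misspelling[i:]
        if right:
            # Undo an inserted letter by deleting it.
            edits.add(left + right[1:])
        if len(right) > 1:
            # Undo a transposition by swapping the pair back.
            edits.add(left + right[1] + right[0] + right[2:])
        for c in letters:
            if right:
                # Undo a wrong letter by substituting another.
                edits.add(left + c + right[1:])
            # Undo an omitted letter by inserting one.
            edits.add(left + c + right)
    return edits & set(dictionary)


def order_suggestions(misspelling, word_freq):
    """Order candidates by descending frequency, then alphabetically."""
    candidates = single_error_candidates(misspelling, word_freq)
    return sorted(candidates, key=lambda w: (-word_freq[w], w))


# Hypothetical frequency table, for illustration only.
WORD_FREQ = {"lose": 50, "lost": 30, "loose": 20, "house": 10}
suggestions = order_suggestions("los", WORD_FREQ)
```

With this toy table, "los" yields "lose" and "lost" (each one omitted letter away), with the more frequent word first. The other enhancements listed in the abstract, such as pronunciation and known error patterns, would need additional scoring terms that are not sketched here.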
Devil in Deerskins: My Life with Grey Owl by Anahareo
Review of Anahareo's Devil in Deerskins: My Life with Grey Owl
BNC! Handle with care! Spelling and tagging errors in the BNC
"You loose your no-claims bonus," instead of "You lose your no-claims bonus," is an example of a real-word spelling error. One way to enable a spellchecker to detect such errors is to prime it with information about likely features of the context for "loose" (verb) as compared with "lose". To this end, we extracted all the examples of "loose" used as a verb from the BNC (World edition, text).
There were, apparently, 159 occurrences of "loose" (VVB or VVI). However, on inspection, well over half of these were not verbs at all (tagging errors) and over half of the rest were misspellings of "lose". Only about 15% were actual occurrences of "loose" as a verb.
This prompted us to undertake a small investigation into errors in the BNC. We report on some words that occur more often as misspellings than in their own right - only one of the 63 occurrences of "ail", for example, is correct (possibly OCR errors) - and some words that are always mistagged, such as "haulier" and "glazier" (never NN), and "hanker" and "loiter" (never VV). We note in particular that, if a rare word resembles a common word (in spelling), it is more likely to appear as a misspelling of the common word than as a correct spelling of the rare word. These cases require some modification of an earlier conclusion (Damerau and Mays, 1989) on misspellings of rare words.
We conclude with a discussion of the desirability, or otherwise, of correcting errors in corpora such as the BNC.
The results may be of interest to people who use the BNC as training data or for teaching.
Cronyism and Capital Controls: Evidence from Malaysia
The initial impact of the Asian financial crisis in Malaysia reduced the expected value of government subsidies to politically favored firms. Of the estimated $5 billion gain in market value for Mahathir-connected firms during September 1998, approximately 32% was due to the increase in the value of their connections. The evidence suggests Malaysian capital controls provided a screen behind which favored firms could be supported.
A study of the teaching of plane geometry in the junior and senior high schools
Thesis (M.A.)--Boston University
Propping and Tunneling
In countries with weak legal systems, there is a great deal of tunneling by the entrepreneurs who control publicly traded firms. However, under some conditions entrepreneurs prop up their firms, i.e., they use their private funds to benefit minority shareholders. We provide evidence and a model that explains propping. In particular, we suggest that issuing debt can credibly commit an entrepreneur to propping, even though creditors can never take possession of any underlying collateral. This helps to explain why emerging markets with weak institutions sometimes grow rapidly and why they are also subject to frequent economic and financial crises.
Masculindians: Conversations About Indigenous Manhood by Sam McKegney
Review of Sam McKegney’s Masculindians: Conversations About Indigenous Manhood
A new perspective on steady-state cosmology: from Einstein to Hoyle
We recently reported the discovery of an unpublished manuscript by Albert
Einstein in which he attempted a 'steady-state' model of the universe, i.e., a
cosmic model in which the expanding universe remains essentially unchanged due
to a continuous formation of matter from empty space. The manuscript was
apparently written in early 1931, many years before the steady-state models of
Fred Hoyle, Hermann Bondi and Thomas Gold. We compare Einstein's steady-state
cosmology with that of Hoyle, Bondi and Gold and consider the reasons Einstein
abandoned his model. The relevance of steady-state models for today's cosmology
is briefly reviewed.
Comment: To be published in the 'Proceedings of the 2014 Institute of Physics
International Conference on the History of Physics', Cambridge University
Press. arXiv admin note: substantial text overlap with arXiv:1504.02873,
arXiv:1402.013
One Hundred Years of the Cosmological Constant: from 'Superfluous Stunt' to Dark Energy
We present a centennial review of the history of the term known as the
cosmological constant. First introduced to the general theory of relativity by
Einstein in 1917 in order to describe a universe that was assumed to be static,
the term fell from favour in the wake of the discovery of the expanding
universe, only to make a dramatic return in recent times. We consider
historical and philosophical aspects of the cosmological constant over four
main epochs: (i) the use of the term in static cosmologies (both Newtonian and
relativistic); (ii) the marginalization of the term following the discovery of
cosmic expansion; (iii) the use of the term to address specific cosmic puzzles
such as the timespan of expansion, the formation of galaxies and the redshifts
of the quasars; (iv) the re-emergence of the term in today's Lambda-CDM
cosmology. We find that the cosmological constant was never truly banished from
theoretical models of the universe, but was sidelined by astronomers for
reasons of convenience. We also find that the return of the term to the
forefront of modern cosmology did not occur as an abrupt paradigm shift due to
one particular set of observations, but as the result of a number of empirical
advances such as the measurement of present cosmic expansion using the Hubble
Space Telescope, the measurement of past expansion using type Ia supernovae
as standard candles, and the measurement of perturbations in the cosmic
microwave background by balloon and satellite. We give a brief overview of
contemporary interpretations of the physics underlying the cosmic constant and
conclude with a synopsis of the famous cosmological constant problem.
Comment: 60 pages, 6 figures. Some corrections, additions and extra
references. Accepted for publication in the European Physical Journal (H)
A large list of confusion sets for spellchecking assessed against a corpus of real-word errors
One of the methods that has been proposed for dealing with real-word errors (errors that occur when a correctly spelled word is substituted for the one intended) is the "confusion-set" approach - a confusion set being a small group of words that are likely to be confused with one another. Using a list of confusion sets drawn up in advance, a spellchecker, on finding one of these words in a text, can assess whether one of the other members of its set would be a better fit and, if it appears to be so, propose that word as a correction. Much of the research using this approach has suffered from two weaknesses. The first is the small number of confusion sets used. The second is that systems have largely been tested on artificial errors. In this paper we address these two weaknesses. We describe the creation of a realistically sized list of confusion sets, then the assembling of a corpus of real-word errors, and then we assess the potential of that list in relation to that corpus.
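The confusion-set procedure outlined in this abstract can be sketched roughly as follows. The abstract does not say how "a better fit" is assessed, so this example assumes a very simple measure: counts of how often a word appears next to its immediate left and right neighbours. The confusion sets, the counts, the `margin` threshold and all function names are hypothetical and chosen only to make the example run.

```python
# Hypothetical confusion sets, drawn up in advance.
CONFUSION_SETS = [{"loose", "lose"}, {"their", "there"}]

# Hypothetical adjacent-word counts: (first word, second word) -> count.
BIGRAM_COUNTS = {
    ("you", "lose"): 40,
    ("lose", "your"): 25,
    ("you", "loose"): 1,
}


def context_score(word, left, right, counts):
    """Sum the counts of the word with its left and right neighbours."""
    return counts.get((left, word), 0) + counts.get((word, right), 0)


def check_real_word_errors(tokens, confusion_sets, counts, margin=2.0):
    """Return (position, word, proposed correction) for suspect words.

    A correction is proposed only when another member of the confusion
    set scores more than `margin` times the score of the word as written.
    """
    proposals = []
    for i, token in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else None
        right = tokens[i + 1] if i + 1 < len(tokens) else None
        for members in confusion_sets:
            if token not in members:
                continue
            own = context_score(token, left, right, counts)
            # Sort the alternatives so ties are broken deterministically.
            alternatives = sorted(members - {token})
            best = max(
                alternatives,
                key=lambda w: context_score(w, left, right, counts),
            )
            best_score = context_score(best, left, right, counts)
            if best_score > 0 and best_score > margin * own:
                proposals.append((i, token, best))
    return proposals


flagged = check_real_word_errors(
    "you loose your bonus".split(), CONFUSION_SETS, BIGRAM_COUNTS
)
```

With these toy counts, "loose" in "you loose your bonus" is flagged with "lose" as the proposed correction, while the same sentence written with "lose" produces no proposal. The `margin` parameter reflects one possible design choice: requiring the alternative to fit clearly better, to limit false alarms on correctly used words.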
