Evaluation methodologies in Automatic Question Generation 2013-2018
In the last few years, Automatic Question Generation (AQG) has attracted increasing interest. In this paper we survey the evaluation methodologies used in AQG. Based on a sample of 37 papers, our research shows that the development of AQG systems has not been accompanied by similar developments in the methodologies used to evaluate them. Indeed, in the papers we examine here, we find a wide variety of both intrinsic and extrinsic evaluation methodologies. Such diverse evaluation practices make it difficult to reliably compare the quality of different generation systems. Our study suggests that, given the rapidly increasing level of research in the area, a common framework is urgently needed to compare the performance of AQG systems and NLG systems more generally.
Rethinking the Agreement in Human Evaluation Tasks
Human evaluations are broadly thought to be more valuable the higher the inter-annotator agreement. In this paper we examine this idea. We describe our experiments and analysis within the area of Automatic Question Generation. Our experiments show how annotators diverge in language annotation tasks due to a range of ineliminable factors. For this reason, we believe that annotation schemes for natural language generation tasks that are aimed at evaluating language quality need to be treated with great care. In particular, an unchecked focus on reducing disagreement among annotators runs the danger of creating generation goals that reward output that is more distant from, rather than closer to, natural human-like language. We conclude the paper by suggesting a new approach to the use of agreement metrics in natural language generation evaluation tasks.
Using discovered, polyphonic patterns to filter computer-generated music
A metric for evaluating the creativity of a music-generating system is presented, the objective being to generate mazurka-style music that inherits salient patterns from an original excerpt by Frédéric Chopin. The metric acts as a filter within our overall system, causing rejection of generated passages that do not inherit salient patterns, until a generated passage survives. Over fifty iterations, the mean number of generations required until survival was 12.7, with standard deviation 13.2. In the interests of clarity and replicability, the system is described with reference to specific excerpts of music. Four concepts (Markov modelling for generation, pattern discovery, pattern quantification, and statistical testing) are presented quite distinctly, so that the reader might adopt (or ignore) each concept as they wish.
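The generate-and-filter loop described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy transition table `TRANSITIONS`, the target `PATTERN`, the first-order Markov walk, and the contiguous-subsequence test standing in for the paper's pattern-inheritance metric.

```python
import random
import statistics

# Hypothetical first-order transition table over pitch names (not from the paper).
TRANSITIONS = {
    "C": ["D", "E"],
    "D": ["C", "E"],
    "E": ["C", "D", "G"],
    "G": ["C", "E"],
}
# Hypothetical "salient pattern" that a surviving passage must contain.
PATTERN = ["C", "E", "G"]


def generate_passage(rng, transitions, start, length):
    """Random walk of `length` states through a first-order Markov chain."""
    state = start
    passage = [state]
    for _ in range(length - 1):
        state = rng.choice(transitions[state])
        passage.append(state)
    return passage


def inherits_pattern(passage, pattern):
    """Stand-in filter: True if `pattern` occurs as a contiguous run in `passage`."""
    n = len(pattern)
    return any(passage[i:i + n] == pattern for i in range(len(passage) - n + 1))


def generations_until_survival(rng, transitions, start, length, pattern,
                               max_tries=10_000):
    """Generate passages until one passes the filter; return the attempt count."""
    for attempt in range(1, max_tries + 1):
        passage = generate_passage(rng, transitions, start, length)
        if inherits_pattern(passage, pattern):
            return attempt
    raise RuntimeError("no passage survived the filter")


rng = random.Random(0)  # fixed seed so repeated runs give the same counts
counts = [
    generations_until_survival(rng, TRANSITIONS, "C", 8, PATTERN)
    for _ in range(50)
]
print("mean:", statistics.mean(counts), "stdev:", statistics.stdev(counts))
```

If each generation passes the filter independently with a fixed probability, the waiting time is geometric and its standard deviation is close to its mean, which is at least roughly in line with the reported 12.7 and 13.2.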
A Conversation with Monroe Sirken
Born January 11, 1921 in New York City, Monroe Sirken grew up in a suburb of
Pasadena, California. He earned B.A. and M.A. degrees in sociology at UCLA in
1946 and 1947, and a Ph.D. in sociology with a minor in mathematics at
the University of Washington in 1950, where Professor Z. W. Birnbaum was his
mentor and thesis advisor. As a Post-Doctoral Fellow of the Social Science
Research Council, Monroe spent 1950-1951 at the Statistics Laboratory,
University of California at Berkeley and the Office of the Assistant Director
for Research, U.S. Bureau of the Census in Suitland, Maryland. Monroe visited
the Census Bureau at a time of great change in the use of sampling and survey
methods, and decided to remain. He began his government career there in 1951 as
a mathematical statistician, and moved to the National Office of Vital
Statistics (NOVS) in 1953 where he was an actuarial mathematician and a
mathematical statistician. He has held a variety of research and administrative
positions at the National Center for Health Statistics (NCHS) and he was the
Associate Director, Research and Methodology and the Director, Office of
Research and Methodology until 1996 when he became a senior research scientist,
the title he currently holds. Aside from administrative responsibilities,
Monroe's major professional interests have been conducting and fostering survey
and statistical research responsive to the needs of federal statistics. His
interest in the design of rare and sensitive population surveys led to the
development of network sampling which improves precision by linking multiple
selection units to the same observation units. His interest in fostering
research on the cognitive aspects of survey methods led to the establishment of
permanent questionnaire design research laboratories, first at NCHS and later
at other federal statistical agencies here and abroad.
Comment: Published at http://dx.doi.org/10.1214/07-STS245 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
A comparative evaluation of algorithms for discovering translational patterns in Baroque keyboard works
We consider the problem of intra-opus pattern discovery, that is, the task of discovering patterns of a specified type within a piece of music. A music analyst undertook this task for works by Domenico Scarlatti and Johann Sebastian Bach, forming a benchmark of 'target' patterns. The performance of two existing algorithms and one of our own creation, called SIACT, is evaluated by comparison with this benchmark. SIACT out-performs the existing algorithms with regard to recall and, more often than not, precision. It is demonstrated that in all but the most carefully selected excerpts of music, the two existing algorithms can be affected by what is termed the 'problem of isolated membership'. Central to the relative success of SIACT is our intention that it should address this particular problem. The paper contrasts string-based and geometric approaches to pattern discovery, with an introduction to the latter. Suggestions for future work are given.
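As a rough illustration of how an algorithm's output might be scored against a benchmark of target patterns, the sketch below computes precision and recall. It assumes, for illustration only, that each pattern is a set of (onset, pitch) points and that a discovered pattern counts as a hit only when it matches a target exactly; the paper's own matching criterion may differ, and the data are made up.

```python
def precision_recall(discovered, targets):
    """Precision and recall of discovered patterns against benchmark targets.

    Both arguments are collections of frozensets of (onset, pitch) points.
    Matching is exact set equality, which is an assumption of this sketch.
    """
    discovered = set(discovered)
    targets = set(targets)
    hits = discovered & targets
    precision = len(hits) / len(discovered) if discovered else 0.0
    recall = len(hits) / len(targets) if targets else 0.0
    return precision, recall


# Made-up benchmark: two target patterns marked by an analyst.
targets = {
    frozenset({(0, 60), (1, 62), (2, 64)}),
    frozenset({(4, 67), (5, 65)}),
}
# Made-up algorithm output: one correct pattern and two spurious ones.
discovered = {
    frozenset({(0, 60), (1, 62), (2, 64)}),
    frozenset({(8, 72), (9, 71)}),
    frozenset({(3, 60), (4, 59)}),
}

p, r = precision_recall(discovered, targets)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here one of three discovered patterns is a target (precision 1/3) and one of two targets is found (recall 1/2).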
“Unlike actors, politicians or eminent military men”: The meaning of hard work in working class autobiography
Copyright @ 2010 The Autobiography Societ
An overview and analysis of community bank mergers
With some of the largest mergers in history now taking place in the financial services industry, the fact that consolidation is also occurring among small banking institutions is often overlooked. The factors that are promoting consolidation in the banking industry are also relevant for the smallest banks, namely the need to spread the cost of technological and administrative overhead and the desire to maintain earnings growth. With limited growth opportunities in many rural communities, smaller banks often choose to merge with other nearby rural banks as a means to gain asset size and improve efficiency. Using a case study approach that focuses on nineteen rural banks that participated in in-market mergers, this article examines whether smaller community banks that followed this merger strategy realized efficiency gains. The results show that such mergers have usually been successful from both a profitability and a cost efficiency perspective. Further, these gains were typically achieved without closing branch offices. These successes are important to rural bankers as they seek opportunities for consolidation. They are also important from a public policy perspective and should be carefully considered by regulators in their evaluation of small bank mergers. Keywords: Bank mergers
