Automatic Quality Estimation for ASR System Combination
Recognizer Output Voting Error Reduction (ROVER) has been widely used for
system combination in automatic speech recognition (ASR). In order to select
the most appropriate words to insert at each position in the output
transcriptions, some ROVER extensions rely on critical information such as
confidence scores and other ASR decoder features. This information, which is
not always available, highly depends on the decoding process and sometimes
tends to overestimate the real quality of the recognized words. In this paper
we propose a novel variant of ROVER that takes advantage of ASR quality
estimation (QE) for ranking the transcriptions at "segment level" instead of:
i) relying on confidence scores, or ii) feeding ROVER with randomly ordered
hypotheses. We first introduce an effective set of features to compensate for
the absence of ASR decoder information. Then, we apply QE techniques to perform
accurate hypothesis ranking at segment-level before starting the fusion
process. The evaluation is carried out on two different tasks, in which we
respectively combine hypotheses coming from independent ASR systems and
multi-microphone recordings. In both tasks, it is assumed that the ASR decoder
information is not available. The proposed approach significantly outperforms
standard ROVER and is competitive with two strong oracles that exploit
prior knowledge about the real quality of the hypotheses to be combined.
Compared to standard ROVER, the absolute WER improvements in the two
evaluation scenarios range from 0.5% to 7.3%.
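To make the pipeline concrete, the following minimal sketch shows how the hypotheses of a segment could be ranked by a QE regressor before fusion. The toy feature set, the random-forest regressor and the function names are illustrative assumptions, not the paper's actual QE setup; the ordered list would then be handed to a ROVER implementation (e.g., SCTK's `rover` tool).

from typing import List, Sequence
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def featurize(hypothesis: str) -> np.ndarray:
    # Toy decoder-independent features: token count, mean token length,
    # type/token ratio. The paper's feature set is richer; these are placeholders.
    tokens = hypothesis.split()
    n = len(tokens)
    mean_len = float(np.mean([len(t) for t in tokens])) if tokens else 0.0
    ttr = len(set(tokens)) / n if n else 0.0
    return np.array([n, mean_len, ttr], dtype=float)

def train_qe_model(hyps: Sequence[str], wer: Sequence[float]) -> RandomForestRegressor:
    # Fit a sentence-level WER predictor on held-out transcriptions with known WER.
    X = np.vstack([featurize(h) for h in hyps])
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, np.asarray(wer, dtype=float))
    return model

def rank_segment(hyps: List[str], model: RandomForestRegressor) -> List[str]:
    # Order hypotheses from lowest to highest predicted WER; the ordered list
    # is then fed to ROVER so that better hypotheses drive the alignment and voting.
    preds = model.predict(np.vstack([featurize(h) for h in hyps]))
    return [h for _, h in sorted(zip(preds, hyps), key=lambda p: p[0])]
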
Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English
The necessity of using a fixed-size word vocabulary in order to control the
model complexity in state-of-the-art neural machine translation (NMT) systems
is an important bottleneck on performance, especially for morphologically rich
languages. Conventional methods that aim to overcome this problem by using
sub-word or character-level representations solely rely on statistics and
disregard the linguistic properties of words, which leads to interruptions in
the word structure and causes semantic and syntactic losses. In this paper, we
propose a new vocabulary reduction method for NMT, which can reduce the
vocabulary of a given input corpus at any rate while also considering the
morphological properties of the language. Our method is based on unsupervised
morphology learning and can be, in principle, used for pre-processing any
language pair. We also present an alternative word segmentation method based on
supervised morphological analysis, which aids us in measuring the accuracy of
our model. We evaluate our method on a Turkish-to-English NMT task where the
input language is morphologically rich and agglutinative. We analyze different
representation methods in terms of translation accuracy as well as the semantic
and syntactic properties of the generated output. Our method obtains a
significant improvement of 2.3 BLEU points over the conventional vocabulary
reduction technique, showing that it can provide better accuracy in open
vocabulary translation of morphologically rich languages.
Comment: The 20th Annual Conference of the European Association for Machine Translation (EAMT), Research Paper, 12 pages
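A minimal sketch of the kind of morphology-aware subword segmentation described above is given below. The hand-written suffix inventory and the greedy right-to-left stripping are illustrative only: the paper learns morphs in an unsupervised way rather than from a fixed list; the "@@" joiner is the conventional marker used to merge subwords back after decoding.

from typing import List

# Hypothetical Turkish suffix inventory (assumption, for illustration only).
SUFFIXES = ["lerin", "ların", "ler", "lar", "den", "dan", "in", "ın", "de", "da"]

def segment(word: str, min_stem: int = 3) -> List[str]:
    # Greedily strip known suffixes from the right, keeping a minimal stem.
    morphs: List[str] = []
    rest = word
    stripped = True
    while stripped:
        stripped = False
        for suf in SUFFIXES:
            if rest.endswith(suf) and len(rest) - len(suf) >= min_stem:
                morphs.insert(0, suf)
                rest = rest[: -len(suf)]
                stripped = True
                break
    return [rest] + morphs

def to_subwords(sentence: str) -> str:
    # Mark word-internal boundaries with "@@" so segments can be re-merged later.
    out: List[str] = []
    for word in sentence.split():
        morphs = segment(word)
        out.extend(m + "@@" if i < len(morphs) - 1 else m for i, m in enumerate(morphs))
    return " ".join(out)

print(to_subwords("evlerinden geliyorum"))  # -> "evler@@ in@@ den geliyorum"
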
Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources
The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not correctly translated. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, but by using external morphological resources. A set of new phrase associations is added to the translation and reordering models; each of them corresponds to a morphological variation of the source/target/both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translations, and results showed performance improvements in terms of automatic scores (BLEU and Meteor) and a reduction of out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
JRC.G.2-Global security and crisis management
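The sketch below illustrates the idea of generating new phrase-table entries from morphological variants of known source words. The difflib ratio is only a stand-in for the paper's morphosyntactically informed similarity score, and the probability discounting scheme is an assumption for illustration.

from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    # Simple character-level similarity; the paper uses morphosyntactic information.
    return SequenceMatcher(None, a, b).ratio()

def expand_phrase_table(phrase_table, variants, min_sim=0.7):
    # phrase_table: {src_word: [(tgt_word, prob), ...]}
    # variants:     {known_src_word: [morphological variants unseen in training]}
    new_entries = {}
    for known, forms in variants.items():
        for form in forms:
            if form in phrase_table:
                continue  # already covered by the parallel data
            sim = similarity(known, form)
            if sim < min_sim:
                continue
            # Reuse the known word's translations, discounted by the similarity score.
            new_entries[form] = [(tgt, prob * sim) for tgt, prob in phrase_table[known]]
    return new_entries

table = {"maison": [("house", 0.8), ("home", 0.2)]}
print(expand_phrase_table(table, {"maison": ["maisons"]}))  # adds an entry for "maisons"
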
DNN adaptation by automatic quality estimation of ASR hypotheses
In this paper we propose to exploit the automatic Quality Estimation (QE) of
ASR hypotheses to perform the unsupervised adaptation of a deep neural network
modeling acoustic probabilities. Our hypothesis is that significant
improvements can be achieved by: i) automatically transcribing the evaluation
data we are currently trying to recognise, and ii) selecting from it a subset
of "good quality" instances based on the word error rate (WER) scores predicted
by a QE component. To validate this hypothesis, we run several experiments on
the evaluation data sets released for the CHiME-3 challenge. First, we operate
in oracle conditions in which manual transcriptions of the evaluation data are
available, thus allowing us to compute the "true" sentence WER. In this
scenario, we perform the adaptation with variable amounts of data, which are
characterised by different levels of quality. Then, we move to realistic
conditions in which the manual transcriptions of the evaluation data are not
available. In this case, the adaptation is performed on data selected according
to the WER scores "predicted" by a QE component. Our results indicate that: i)
QE predictions allow us to closely approximate the adaptation results obtained
in oracle conditions, and ii) the overall ASR performance based on the proposed
QE-driven adaptation method is significantly better than the strong, most
recent CHiME-3 baseline.
Comment: Computer Speech & Language, December 201
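The data-selection step described above can be sketched as follows. The threshold value and the field names are assumptions; in the paper the selection relies on sentence-level WER scores predicted by the QE component.

from typing import List, Tuple

def select_adaptation_data(
    hypotheses: List[Tuple[str, str]],   # (utterance_id, automatic transcription)
    predicted_wer: List[float],          # QE-predicted WER per utterance
    max_wer: float = 0.20,               # keep only "good quality" hypotheses (illustrative value)
) -> List[Tuple[str, str]]:
    # Keep the automatically transcribed utterances whose predicted WER is low enough;
    # these (audio, hypothesis) pairs become pseudo-labels for fine-tuning the acoustic DNN.
    return [pair for pair, wer in zip(hypotheses, predicted_wer) if wer <= max_wer]

selected = select_adaptation_data(
    [("utt1", "the meeting starts now"), ("utt2", "noisy unreliable output")],
    predicted_wer=[0.08, 0.45],
)
# selected -> [("utt1", "the meeting starts now")]
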
Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018
This paper describes FBK's submission to the end-to-end English-German speech
translation task at IWSLT 2018. Our system relies on a state-of-the-art model
based on LSTMs and CNNs, where the CNNs are used to reduce the temporal
dimension of the audio input, which is in general much larger than that of machine
translation input. Our model was trained only on the audio-to-text parallel
data released for the task, and fine-tuned on cleaned subsets of the original
training corpus. The addition of weight normalization and label smoothing
improved the baseline system by 1.0 BLEU point on our validation set. The final
submission also featured checkpoint averaging within a training run and
ensemble decoding of models trained during multiple runs. On test data, our
best single model obtained a BLEU score of 9.7, while the ensemble obtained a
BLEU score of 10.24.
Comment: 6 pages, 2 figures, system description at the 15th International Workshop on Spoken Language Translation (IWSLT) 2018
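Checkpoint averaging, mentioned above as part of the final submission, can be sketched as below. The file names are hypothetical, and the code assumes each checkpoint is a plain state_dict of parameter tensors; the actual FBK pipeline may store checkpoints differently.

import torch

def average_checkpoints(paths):
    # Average the parameters of several checkpoints saved during one training run.
    avg = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg is None:
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / len(paths) for k, v in avg.items()}

# Usage (hypothetical file names): decode with the single averaged model.
# averaged = average_checkpoints(["ckpt_18.pt", "ckpt_19.pt", "ckpt_20.pt"])
# torch.save(averaged, "averaged.pt")
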
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation
(NMT) models by means of a shared dynamic vocabulary. Our approach allows us to
extend an initial model for a given language pair to cover new languages by
adapting its vocabulary as new data become available (i.e., introducing
new vocabulary items if they are not included in the initial model). The
parameter transfer mechanism is evaluated in two scenarios: i) adapting a
trained single-language NMT system to work with a new language pair, and ii)
continuously adding new language pairs to grow into a multilingual NMT system. In
both scenarios our goal is to improve the translation performance while
minimizing the training convergence time. Preliminary experiments spanning five
languages with different training data sizes (i.e., 5k and 50k parallel
sentences) show a significant performance gain ranging from +3.85 up to +13.63
BLEU in different language directions. Moreover, when compared with training an
NMT model from scratch, our transfer-learning approach allows us to reach
higher performance after training for at most 4% of the total training steps.
Comment: Published at the International Workshop on Spoken Language Translation (IWSLT), 201
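A minimal sketch of the dynamic-vocabulary idea is shown below: embeddings of tokens already known to the parent model are reused, while new tokens of the child language pair get freshly initialised rows. The initialisation choices are assumptions, not necessarily the ones used in the paper.

import torch
import torch.nn as nn

def expand_embeddings(old_emb: nn.Embedding, old_vocab: dict, new_vocab: dict) -> nn.Embedding:
    # old_vocab / new_vocab map tokens to indices; overlapping tokens keep their
    # trained vectors, new tokens get small random vectors (assumed initialisation).
    dim = old_emb.embedding_dim
    new_emb = nn.Embedding(len(new_vocab), dim)
    nn.init.normal_(new_emb.weight, std=0.01)
    with torch.no_grad():
        for tok, new_idx in new_vocab.items():
            if tok in old_vocab:
                new_emb.weight[new_idx] = old_emb.weight[old_vocab[tok]]
    return new_emb

old_vocab = {"<unk>": 0, "hello": 1, "world": 2}
new_vocab = {"<unk>": 0, "hello": 1, "bonjour": 2, "monde": 3}
old = nn.Embedding(len(old_vocab), 8)
new = expand_embeddings(old, old_vocab, new_vocab)  # 4 x 8; rows for <unk> and "hello" transferred
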
Wrapping up a Summary: from Representation to Generation
The main focus of this work is to investigate robust ways of generating summaries from summary representations without resorting to simple sentence extraction, aiming at more human-like summaries. This is motivated by empirical evidence from TAC 2009 data showing that human summaries contain, on average, more and shorter sentences than the system summaries. We report encouraging preliminary results comparable to those attained by participating systems at TAC 2009.
JRC.DG.G.2-Global security and crisis management
Improving the confidence of Machine Translation quality estimates
We investigate the problem of estimating the quality of the output of machine translation systems at the sentence level when reference translations are not available. The focus is on automatically identifying a threshold to map a continuous predicted score into "good"/"bad" categories for filtering out bad-quality cases in a translation post-editing task. We use the theory of Inductive Confidence Machines (ICM) to identify this threshold according to a confidence level that is expected for a given task. Experiments show that this approach gives improved estimates compared to those based on classification or regression algorithms without ICM.
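As a rough illustration of the thresholding idea, the sketch below picks the lowest score threshold whose empirical error (bad sentences accepted as good) on a calibration set stays below a tolerated error rate epsilon. This is a simplified calibration-based stand-in for the actual ICM machinery, and all names and values are illustrative.

import numpy as np

def calibrate_threshold(scores, is_good, epsilon=0.1):
    # scores: predicted quality (higher = better); is_good: true binary labels.
    scores = np.asarray(scores, dtype=float)
    is_good = np.asarray(is_good, dtype=bool)
    for thr in np.sort(scores):
        accepted = scores >= thr
        if accepted.any():
            error = np.mean(~is_good[accepted])
            if error <= epsilon:
                return thr
    return np.inf  # no threshold achieves the requested confidence

thr = calibrate_threshold([0.3, 0.5, 0.7, 0.9], [False, False, True, True], epsilon=0.1)
# Sentences scoring >= thr would be routed to post-editing; the rest are flagged "bad".
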
L'Aquila, 6 April 2009: managing the emergency, promoting cohesion and social health
This work originates within the Vela Project (Progetto Vela), whose general aim is "the promotion of health" in communities struck by natural or humanitarian emergencies. The Project is an initiative developed by a group of researchers from the Università degli Studi di Padova (FISPPA Department: Filosofia, Sociologia, Pedagogia e Psicologia Applicata), started in October 2011 with the goal of investigating the repercussions on the interactive arrangements of the L'Aquila community, i.e., how it configures its own social reality, following the earthquake of 6 April 2009. The article opens with a theoretical and epistemological reflection on the relationship between "catastrophe", "health" and "emergency", which led us to recognise the relevance of investigating these notions as they are configured by community members, rather than treating them as static entities in themselves. Consistently with these assumptions, dedicated survey protocols were used to investigate the discursive modalities that configure the "health" of the L'Aquila area before the earthquake, in the urgent hours after it, at the present time and in future projection. The protocols were administered to people in different roles (citizens, shopkeepers, teachers, law-enforcement officers, civil protection operators, doctors and psychologists), so as to collect the text of all the voices of the L'Aquila community. The findings showed that the people of L'Aquila still configure their community as "catastrophic" and thus associated with the seismic event; the earthquake has therefore pervaded, and still pervades, the biography of the L'Aquila community (in its past, present and future perspectives), with a high degree of potential social disintegration.
