431 research outputs found

    Automatic Quality Estimation for ASR System Combination

    Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). To select the most appropriate word to insert at each position in the output transcription, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information is not always available, depends heavily on the decoding process, and sometimes tends to overestimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) to rank the transcriptions at segment level instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and from multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and is competitive with two strong oracles that exploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the absolute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%.

    Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English

    The need for a fixed-size word vocabulary to control model complexity in state-of-the-art neural machine translation (NMT) systems is an important performance bottleneck, especially for morphologically rich languages. Conventional methods that aim to overcome this problem with sub-word or character-level representations rely solely on statistics and disregard the linguistic properties of words, which interrupts the word structure and causes semantic and syntactic losses. In this paper, we propose a new vocabulary reduction method for NMT, which can reduce the vocabulary of a given input corpus at any rate while also considering the morphological properties of the language. Our method is based on unsupervised morphology learning and can, in principle, be used for pre-processing any language pair. We also present an alternative word segmentation method based on supervised morphological analysis, which helps us measure the accuracy of our model. We evaluate our method on a Turkish-to-English NMT task, where the input language is morphologically rich and agglutinative. We analyze different representation methods in terms of translation accuracy as well as the semantic and syntactic properties of the generated output. Our method obtains a significant improvement of 2.3 BLEU points over the conventional vocabulary reduction technique, showing that it can provide better accuracy in open-vocabulary translation of morphologically rich languages.
    Comment: The 20th Annual Conference of the European Association for Machine Translation (EAMT), Research Paper, 12 pages

    Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources

    The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not translated correctly. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each of them corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translation, and the results showed improved automatic scores (BLEU and Meteor) and fewer out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
    JRC.G.2-Global security and crisis management

    DNN adaptation by automatic quality estimation of ASR hypotheses

    In this paper we propose to exploit the automatic Quality Estimation (QE) of ASR hypotheses to perform the unsupervised adaptation of a deep neural network modeling acoustic probabilities. Our hypothesis is that significant improvements can be achieved by: i) automatically transcribing the evaluation data we are currently trying to recognise, and ii) selecting from it a subset of "good quality" instances based on the word error rate (WER) scores predicted by a QE component. To validate this hypothesis, we run several experiments on the evaluation data sets released for the CHiME-3 challenge. First, we operate in oracle conditions in which manual transcriptions of the evaluation data are available, thus allowing us to compute the "true" sentence WER. In this scenario, we perform the adaptation with variable amounts of data, which are characterised by different levels of quality. Then, we move to realistic conditions in which the manual transcriptions of the evaluation data are not available. In this case, the adaptation is performed on data selected according to the WER scores "predicted" by a QE component. Our results indicate that: i) QE predictions allow us to closely approximate the adaptation results obtained in oracle conditions, and ii) the overall ASR performance based on the proposed QE-driven adaptation method is significantly better than the strong, most recent CHiME-3 baseline.
    Comment: Computer Speech & Language December 201
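
The two conditions described in this abstract can be illustrated with a minimal sketch: the "true" sentence WER computed from a manual transcription (oracle condition), and a filter that keeps utterances whose WER score falls below a cut-off (the selection step). The function names, the data layout, and the cut-off value are hypothetical and are not taken from the paper; the QE component that would produce predicted scores is not modeled here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Sentence WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def select_for_adaptation(scored_utterances, max_wer):
    """Keep utterance ids whose (true or predicted) WER is at most max_wer.

    scored_utterances: iterable of (utterance_id, wer_score) pairs.
    max_wer: hypothetical cut-off; the paper does not state one here.
    """
    return [utt_id for utt_id, wer in scored_utterances if wer <= max_wer]
```

For example, `select_for_adaptation([("u1", 0.05), ("u2", 0.40)], max_wer=0.10)` keeps only `"u1"`; in the realistic condition the scores would come from a QE model rather than from `word_error_rate`.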

    Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

    This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018. Our system relies on a state-of-the-art model based on LSTMs and CNNs, where the CNNs are used to reduce the temporal dimension of the audio input, which is in general much longer than machine translation input. Our model was trained only on the audio-to-text parallel data released for the task, and fine-tuned on cleaned subsets of the original training corpus. The addition of weight normalization and label smoothing improved the baseline system by 1.0 BLEU point on our validation set. The final submission also featured checkpoint averaging within a training run and ensemble decoding of models trained during multiple runs. On test data, our best single model obtained a BLEU score of 9.7, while the ensemble obtained a BLEU score of 10.24.
    Comment: 6 pages, 2 figures, system description at the 15th International Workshop on Spoken Language Translation (IWSLT) 201
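
Checkpoint averaging, mentioned in this abstract, is commonly done by taking the element-wise mean of the parameters saved at several points of one training run. The sketch below shows that general idea only; representing a checkpoint as a dictionary from parameter name to a flat list of floats is an assumption made for illustration, and the abstract does not say how many checkpoints were averaged or how they were chosen.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameters across checkpoints.

    checkpoints: list of dicts mapping a parameter name to a flat list
    of floats (a hypothetical stand-in for real model tensors).
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    names = set(checkpoints[0])
    for ckpt in checkpoints[1:]:
        if set(ckpt) != names:
            raise ValueError("checkpoints must share the same parameter names")
    count = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        vectors = [ckpt[name] for ckpt in checkpoints]
        if len({len(vec) for vec in vectors}) != 1:
            raise ValueError(f"parameter {name!r} has inconsistent sizes")
        # zip(*vectors) groups the i-th value of every checkpoint together.
        averaged[name] = [sum(values) / count for values in zip(*vectors)]
    return averaged
```

Averaging `{"w": [1.0, 2.0]}` and `{"w": [3.0, 4.0]}` yields `{"w": [2.0, 3.0]}`. Ensemble decoding, by contrast, keeps the models separate and combines their output distributions at decoding time.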

    Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary

    We propose a method to transfer knowledge across neural machine translation (NMT) models by means of a shared dynamic vocabulary. Our approach allows an initial model for a given language pair to be extended to cover new languages by adapting its vocabulary as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model). The parameter transfer mechanism is evaluated in two scenarios: i) adapting a trained single-language-pair NMT system to work with a new language pair, and ii) continuously adding new language pairs to grow into a multilingual NMT system. In both scenarios our goal is to improve the translation performance while minimizing the training convergence time. Preliminary experiments spanning five languages with different training data sizes (i.e., 5k and 50k parallel sentences) show a significant performance gain ranging from +3.85 up to +13.63 BLEU in different language directions. Moreover, when compared with training an NMT model from scratch, our transfer-learning approach allows us to reach higher performance after training up to 4% of the total training steps.
    Comment: Published at the International Workshop on Spoken Language Translation (IWSLT), 201

    Wrapping up a Summary: from Representation to Generation

    The main focus of this work is to investigate robust ways of generating summaries from summary representations without resorting to simple sentence extraction, aiming at more human-like summaries. This is motivated by empirical evidence from TAC 2009 data showing that human summaries contain on average more and shorter sentences than the system summaries. We report encouraging preliminary results comparable to those attained by participating systems at TAC 2009.
    JRC.DG.G.2-Global security and crisis management

    Improving the confidence of Machine Translation quality estimates

    We investigate the problem of estimating the quality of the output of machine translation systems at the sentence level when reference translations are not available. The focus is on automatically identifying a threshold to map a continuous predicted score into "good" / "bad" categories for filtering out bad-quality cases in a translation post-editing task. We use the theory of Inductive Confidence Machines (ICM) to identify this threshold according to a confidence level that is expected for a given task. Experiments show that this approach gives improved estimates when compared to those based on classification or regression algorithms without ICM.
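
The idea of tying a "good" / "bad" threshold to a chosen confidence level can be illustrated with a simplified calibration-quantile rule. This is a stand-in for illustration, not the ICM procedure used in the paper: it assumes a held-out calibration set of predicted scores for translations known to be bad, assumes higher scores mean better quality, and picks the lowest threshold such that at most a fraction `1 - confidence` of those bad calibration items would be labeled "good". All names are hypothetical.

```python
import math


def calibrate_threshold(bad_scores, confidence):
    """Threshold below which at least `confidence` of known-bad scores fall.

    bad_scores: predicted quality scores of calibration items known to be bad.
    confidence: desired level in (0, 1], e.g. 0.9.
    """
    if not bad_scores:
        raise ValueError("need at least one calibration score")
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    ordered = sorted(bad_scores)
    # Number of bad calibration items that must be at or below the threshold.
    k = math.ceil(confidence * len(ordered))
    return ordered[k - 1]


def classify(score, threshold):
    """Map a continuous predicted score to a category."""
    return "good" if score > threshold else "bad"
```

With calibration scores `[0.4, 0.1, 0.3, 0.2]` and `confidence=0.5`, the threshold is `0.2`, so a new translation scored `0.25` is kept as "good" while one scored `0.2` is filtered out as "bad". Raising the confidence level raises the threshold and filters more aggressively.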

    L'Aquila, 6 April 2009: emergency management and the promotion of cohesion and social health

    This work was carried out within the Progetto Vela, whose general objective is "the promotion of health" in communities affected by natural or humanitarian emergencies. The project is an initiative developed by a group of researchers affiliated with the Università degli Studi di Padova (FISPPA department: Philosophy, Sociology, Pedagogy and Applied Psychology), launched in October 2011 with the aim of investigating the repercussions on the interactive arrangements of the L'Aquila community, that is, how the community configures its own social reality, following the earthquake of 6 April 2009. The article opens with a theoretical and epistemological reflection on the relationship between "catastrophe", "health" and "emergency", which led to the position that these should be investigated as they are configured by the members of the community rather than treated as static entities in themselves. Consistent with these assumptions, dedicated survey protocols were used to investigate the discursive modalities that configure the "health" of the L'Aquila area before the earthquake, in the urgent hours after the earthquake, at present, and in future projection. The protocols were administered to people in different roles (citizens, shopkeepers, teachers, law enforcement officers, civil protection workers, doctors and psychologists), so as to collect the text of all the voices of the L'Aquila community. The findings showed that the people of L'Aquila still configure their community as "catastrophic" and therefore associated with the seismic event; the latter has thus pervaded, and continues to pervade, the biography of the L'Aquila community (in past, present and future perspective) with a high rate of potential social disintegration.