Learning to Communicate with Deep Multi-Agent Reinforcement Learning
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.
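The core mechanism of DIAL is that, during centralised training, the error signal of a receiving agent is backpropagated through the noisy, real-valued communication channel into the sending agent's network. The sketch below (PyTorch) illustrates only that mechanism on a toy referential task rather than the paper's full deep Q-learning setup; the Speaker/Listener modules, dimensions, and noise level are illustrative assumptions, not the paper's architecture.

# Hedged sketch of the DIAL idea: gradients flow from the listener's loss
# back through a noisy, real-valued channel into the speaker's message network.
# Toy supervised task, not the paper's deep Q-learning setup; all names and
# sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class Speaker(nn.Module):
    """Maps a private observation to a real-valued message."""
    def __init__(self, obs_dim=4, msg_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 16), nn.ReLU(), nn.Linear(16, msg_dim))

    def forward(self, obs):
        return self.net(obs)

class Listener(nn.Module):
    """Predicts the speaker's hidden label from the received message."""
    def __init__(self, msg_dim=2, n_classes=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(msg_dim, 16), nn.ReLU(), nn.Linear(16, n_classes))

    def forward(self, msg):
        return self.net(msg)

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(list(speaker.parameters()) + list(listener.parameters()), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    labels = torch.randint(0, 4, (32,))                 # what the speaker privately observes
    obs = torch.nn.functional.one_hot(labels, 4).float()
    msg = speaker(obs)                                   # real-valued message
    noisy_msg = msg + 0.5 * torch.randn_like(msg)        # noisy channel (akin to DIAL's DRU noise)
    logits = listener(noisy_msg)
    loss = loss_fn(logits, labels)                       # listener's error ...
    opt.zero_grad()
    loss.backward()                                      # ... backpropagated through the channel into the speaker
    opt.step()

At execution time DIAL discretises the channel so that agents can act in a fully decentralised way; the sketch above omits that step.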
Machine learning for ancient languages: a survey
Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions, to determining the authorship of works of literature. Technological aids have long supported the study of ancient texts, but in recent years advances in artificial intelligence and machine learning have enabled analyses on a scale and in a detail that are reshaping the field of humanities, similarly to how microscopes and telescopes have contributed to the realm of science. This article aims to provide a comprehensive survey of published research using machine learning for the study of ancient texts written in any language, script, and medium, spanning over three and a half millennia of civilizations around the ancient world. To analyze the relevant literature, we introduce a taxonomy of tasks inspired by the steps involved in the study of ancient documents: digitization, restoration, attribution, linguistic analysis, textual criticism, translation, and decipherment. This work offers three major contributions: first, mapping the interdisciplinary field carved out by the synergy between the humanities and machine learning; second, highlighting how active collaboration between specialists from both fields is key to producing impactful and compelling scholarship; third, highlighting promising directions for future work in this field. Thus, this work promotes and supports the continued collaborative impetus between the humanities and machine learning.
DualLip: A System for Joint Lip Reading and Generation
Lip reading aims to recognize text from a talking lip video, while lip generation aims to synthesize a talking lip video from text; lip generation is a key component of talking face generation and is the dual task of lip reading. In this paper, we develop DualLip, a system that jointly improves lip reading and generation by leveraging this task duality and using unlabeled text and lip video data. The key ideas of DualLip are: 1) generate lip video from unlabeled text with a lip generation model, and use the pseudo pairs to improve lip reading; 2) generate text from unlabeled lip video with a lip reading model, and use the pseudo pairs to improve lip generation. We further extend DualLip to talking face generation with two additional components: lip-to-face generation and text-to-speech generation. Experiments on GRID and TCD-TIMIT demonstrate the effectiveness of DualLip in improving lip reading, lip generation, and talking face generation by utilizing unlabeled data. Specifically, the lip generation model in our DualLip system trained with only 10% of the paired data surpasses the performance of the model trained with all of the paired data. On the GRID lip reading benchmark, we achieve a 1.16% character error rate and a 2.71% word error rate, outperforming state-of-the-art models that use the same amount of paired data.
Comment: Accepted by ACM Multimedia 2020
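The dual-training idea reduces to a pseudo-pair loop in which each model labels unlabeled data for the other. The sketch below (PyTorch) compresses lip videos and text down to plain feature vectors and linear models, so it illustrates only the training logic, not the sequence models or GRID/TCD-TIMIT pipelines used in the paper; every model, dimension, and loss choice is an illustrative assumption.

# Hedged sketch of DualLip-style dual training with pseudo pairs.
# "Text" is a bag-of-tokens vector and "lip video" is a flat feature vector;
# the real system uses sequence models. All names and sizes are assumptions.
import torch
import torch.nn as nn

TEXT_DIM, LIP_DIM = 32, 64

lip_reader = nn.Linear(LIP_DIM, TEXT_DIM)      # lip features -> text logits
lip_generator = nn.Linear(TEXT_DIM, LIP_DIM)   # text -> lip features

opt_reader = torch.optim.Adam(lip_reader.parameters(), lr=1e-3)
opt_gen = torch.optim.Adam(lip_generator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def dual_step(unlabeled_text, unlabeled_lip):
    # 1) Unlabeled text -> pseudo lip video, giving a (lip, text) pair for the reader.
    with torch.no_grad():
        pseudo_lip = lip_generator(unlabeled_text)
    reader_loss = bce(lip_reader(pseudo_lip), unlabeled_text)
    opt_reader.zero_grad()
    reader_loss.backward()
    opt_reader.step()

    # 2) Unlabeled lip video -> pseudo text, giving a (text, lip) pair for the generator.
    with torch.no_grad():
        pseudo_text = torch.sigmoid(lip_reader(unlabeled_lip)).round()
    gen_loss = mse(lip_generator(pseudo_text), unlabeled_lip)
    opt_gen.zero_grad()
    gen_loss.backward()
    opt_gen.step()
    return reader_loss.item(), gen_loss.item()

# Toy usage: random batches stand in for unlabeled text corpora and lip videos.
for _ in range(100):
    text_batch = torch.randint(0, 2, (16, TEXT_DIM)).float()
    lip_batch = torch.randn(16, LIP_DIM)
    dual_step(text_batch, lip_batch)

In the full system these pseudo pairs supplement the limited ground-truth pairs during training, which is how the abstract's result with only 10% of the paired data is obtained.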
