Sound Event Detection in Synthetic Audio: Analysis of the DCASE 2016 Task Results
As part of the 2016 public evaluation challenge on Detection and
Classification of Acoustic Scenes and Events (DCASE 2016), the second task
focused on evaluating sound event detection systems using synthetic mixtures of
office sounds. This task, which follows the 'Event Detection - Office
Synthetic' task of DCASE 2013, studies the behaviour of tested algorithms when
facing controlled levels of audio complexity with respect to background noise
and polyphony/density, with the added benefit of a very accurate ground truth.
This paper presents the task formulation, evaluation metrics, and submitted
systems, and provides a statistical analysis of the achieved results with
respect to various aspects of the evaluation dataset.
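Sound event detection systems of this kind are often scored with segment-based metrics. As an illustration only (the exact metrics used in the task are defined in the paper, and the event tuples, helper names, and one-second segment length below are all hypothetical), a segment-based F-score can be sketched as:

```python
def segment_labels(events, n_segments, seg_dur=1.0):
    """Convert (onset, offset, label) events into per-segment label sets."""
    segs = [set() for _ in range(n_segments)]
    for onset, offset, label in events:
        for i in range(n_segments):
            t0, t1 = i * seg_dur, (i + 1) * seg_dur
            if onset < t1 and offset > t0:  # event overlaps this segment
                segs[i].add(label)
    return segs

def segment_f1(ref_events, est_events, n_segments):
    """F-score over segment-level label intersections (illustrative)."""
    ref = segment_labels(ref_events, n_segments)
    est = segment_labels(est_events, n_segments)
    tp = sum(len(r & e) for r, e in zip(ref, est))
    fp = sum(len(e - r) for r, e in zip(ref, est))
    fn = sum(len(r - e) for r, e in zip(ref, est))
    return 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 1.0

ref = [(0.0, 2.5, "phone"), (3.0, 4.0, "door")]
est = [(0.0, 2.0, "phone"), (3.2, 5.0, "door")]
print(round(segment_f1(ref, est, 5), 3))  # 0.75
```

With one-second segments, a detection counts as correct whenever it overlaps the same segment as a reference event of the same class, which is why segment-based scores forgive small onset/offset errors.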
The bag-of-frames approach: a not so sufficient model for urban soundscapes
The "bag-of-frames" approach (BOF), which encodes audio signals as the
long-term statistical distribution of short-term spectral features, is commonly
regarded as an effective and sufficient way to represent environmental sound
recordings (soundscapes) since its introduction in an influential 2007 article.
The present paper describes a conceptual replication of this seminal article
using several new soundscape datasets, with results strongly questioning the
adequacy of the BOF approach for the task. We show that the good accuracy
originally reported with BOF likely results from a particularly favorable
dataset with low within-class variability, and that for more realistic
datasets, BOF in fact does not perform significantly better than a mere
one-point average of the signal's features. Soundscape modeling, therefore,
may not be the closed case it was once thought to be. Progress, we argue,
could lie in reconsidering the problem so as to account for the individual
acoustic events within each soundscape.
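The two representations compared above can be sketched on dummy data. This is a minimal illustration, assuming a simplified instantiation of BOF as long-term statistics (here per-dimension mean and variance) of short-term features; the original article models the frame distribution with a Gaussian mixture, and the feature matrix below merely stands in for e.g. MFCC frames:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 13))  # 500 frames x 13 coefficients (dummy)

# "Bag-of-frames": summarize the long-term distribution of the short-term
# features; here, per-dimension mean and variance as a simple stand-in.
bof = np.concatenate([features.mean(axis=0), features.var(axis=0)])

# The "one-point average" baseline the paper compares against:
# a single mean feature vector for the whole recording.
one_point = features.mean(axis=0)

print(bof.shape)        # (26,)
print(one_point.shape)  # (13,)
```

The paper's finding is that, on realistic urban soundscapes, classifiers built on the first representation do not significantly outperform ones built on the second.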
On the visual display of audio data using stacked graphs
Visualisation is an important tool for many steps of a research project. In this paper, we present several displays of audio data based on stacked graphs. Thanks to a careful use of layering, the proposed displays concisely convey a large amount of information. Many flavours are presented, each useful for a specific type of data, from spectral and chromatic data to multi-source and multi-channel data. We demonstrate that, for spectral and chromatic data, such displays offer a different compromise than the traditional spectrogram and chromagram, emphasizing timing information over frequency.
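As an illustration of the kind of data such a display stacks, the sketch below computes per-band spectral energy over time and the cumulative baseline each layer would sit on; the signal, frame size, and band count are arbitrary, and the rendering call is only indicated in a comment:

```python
import numpy as np

rng = np.random.default_rng(2)
sig = rng.normal(size=4096)       # dummy audio signal
frame, n_bands = 256, 4

# Power spectrogram: non-overlapping frames, magnitude-squared rFFT.
frames = sig[: len(sig) // frame * frame].reshape(-1, frame)
spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Sum the spectrum into a few frequency bands per frame.
bands = np.array_split(spec, n_bands, axis=1)
energy = np.stack([b.sum(axis=1) for b in bands], axis=0)  # (bands, time)

# Lower edge of each stacked layer: cumulative sum of the layers below.
baselines = np.cumsum(energy, axis=0) - energy
# With matplotlib, one would then call:
#   plt.stackplot(range(energy.shape[1]), energy)
print(energy.shape)  # (4, 16)
```

Stacking the band energies this way trades exact frequency readout for a compact view of how the energy distribution evolves over time, which matches the compromise the paper describes.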
A novel interface for audio based sound data mining
In this paper, the design of a web interface for audio-based sound data mining is studied. The interface allows the user to explore a sound dataset without any written textual hint. Dataset sounds are grouped into semantic classes which are themselves clustered to build a semantic hierarchical structure. Each class is represented by a circle placed in a two-dimensional space according to its semantic level. Several means of displaying sounds following this template are presented and evaluated with a crowdsourcing experiment.
Large-scale feature selection with Gaussian mixture models for the classification of high dimensional remote sensing images
A large-scale feature selection wrapper is discussed for the classification of high dimensional remote sensing images. An efficient implementation is proposed based on intrinsic properties of Gaussian mixture models and block matrices. The criterion function is split into two parts: one that is updated to test each feature and one that needs to be updated only once per feature selection step. This split saves a large amount of computation for each test. The algorithm is implemented in C++ and integrated into the Orfeo Toolbox. It has been compared to other classification algorithms on two high dimension remote sensing images. Results show that the approach provides good classification accuracies with low computation time.
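The wrapper scheme itself can be sketched as a greedy forward selection; the sketch below deliberately omits the paper's contribution (the incremental update of the Gaussian mixture criterion) and uses a plain class-conditional Gaussian classifier's training accuracy on synthetic data as the criterion, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 8
X = rng.normal(size=(n, d))
y = (X[:, 2] + 0.5 * X[:, 5] > 0).astype(int)  # only features 2 and 5 matter

def score(subset):
    """Wrapper criterion (illustrative): training accuracy of a
    diagonal class-conditional Gaussian classifier on `subset`."""
    Xs = X[:, subset]
    ll = np.zeros((n, 2))
    for c in (0, 1):
        Xc = Xs[y == c]
        mu, var = Xc.mean(axis=0), Xc.var(axis=0) + 1e-6
        ll[:, c] = -0.5 * (((Xs - mu) ** 2) / var + np.log(var)).sum(axis=1)
    return (ll.argmax(axis=1) == y).mean()

selected = []
for _ in range(3):  # greedily add the feature that most improves the criterion
    remaining = [j for j in range(d) if j not in selected]
    best = max(remaining, key=lambda j: score(selected + [j]))
    selected.append(best)

print(selected)
```

Each outer iteration re-scores every remaining feature against the current subset; the paper's split of the criterion makes exactly this inner loop cheap, since most of the computation can be shared across the candidate features.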
SimScene: a web-based acoustic scenes simulator
We introduce in this paper a soundscape simulator called SimScene, designed to be used as an experimental tool to characterize the mental representation of sound environments. The simulator allows a subject to generate a full sonic environment by sequencing and mixing sound elements, and by manipulating their sound level and time positioning. To make the simulation process effective, SimScene has not been designed to manipulate individual parameters of individual sounds, but to specify high-level parameters for whole classes of sounds, organized into a hierarchical, semantically structured dataset. To avoid any linguistic bias, a listening-oriented interface allows subjects to explore the dataset without any written textual help. The entire software is developed in JavaScript using the standard Web Audio technology, and is thus fully supported by most modern web browsers. This should allow experimenters to adopt a crowdsourcing approach to experimentation in order to assess hypotheses on large populations, and facilitate the development of experimental protocols to investigate the influence of socio-cultural background on soundscape perception.
