Evidence for attractors in English intonation

Author: Bartlett F. C.
Bettina Braun
Brazil D.
Burton S. Rosner
Esther Grabe
Greg Kochanski
Gussenhoven C.
Hirst D.
Hoeting J. A.
Holm B.
Kingdon R.
Kohler K. J.
O’Connor J. D.
Pierrehumbert J. B.
Plomp R.
Remijsen B.
Talkin D.
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2006

Although the pitch of the human voice is continuously variable, some linguists contend that intonation in speech is restricted to a small set of patterns. This claim is tested by asking subjects to mimic a block of 100 randomly generated intonation contours and then to imitate themselves in several successive sessions. The produced f0 contours gradually converge towards a limited set of distinct, previously recognized basic English intonation patterns. These patterns are "attractors" in the space of possible English intonation contours. The convergence does not occur immediately: seven of the ten participants show continued convergence toward their attractors after the first iteration. Subjects retain and use information beyond phonological contrasts, suggesting that intonational phonology is not a complete description of their mental representation of intonation.

KOPS - The Institutional Repository of the University of Konstanz

Crossref

Oxford University Research Archive

MPG.PuRe

Single channel speech separation in modulation frequency domain based on a novel pitch range estimation method

Author: A de Cheveigne
AS Bregman
D Talkin
DL Wang
G Hu
GJ Brown
J Barker
J Le Roux
J Tabrikian
JJ Sroka
L Atlas
M Buchler
M Wu
MH Radfar
MP Cooke
Q Li
R Drullman
RP Lippmann
S Dubnov
SM Schimmel
TW Lee
X Huang
Y Shao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Speech Processing and Prosody

Author: A Camacho
A de Cheveigné
BW Schuller
D Jouvet
D Talkin
JR Novak
K Bartkova
L Orosanu
M Benzeghiba
M Bisani
M Dargnat
M Eskenazi
M Schröder
M Stede
P Martin
R Shadiev
RB Lanjewar
SM Witt
V Sethu
VM Quang
Publication venue: HAL CCSD
Publication date: 10/09/2019

The prosody of the speech signal conveys information beyond the linguistic content of the message: prosody structures the utterance, and also carries information on the speaker's attitude and emotion. Sound duration, energy, and fundamental frequency are the prosodic features. However, their automatic computation and use are not straightforward. Sound duration features are usually extracted from speech recognition results or from a forced speech-text alignment. Although the resulting segmentation is usually acceptable on clean native speech data, performance degrades on noisy or non-native speech. Many algorithms have been developed for computing the fundamental frequency; they perform rather well on clean speech, but again, performance degrades in noisy conditions. In some applications, for example computer assisted language learning, the quality of the prosodic features is critical: the quality of the diagnosis of the learner's pronunciation depends heavily on the precision and reliability of the estimated prosodic parameters. The paper considers the computation of prosodic features, shows the limitations of automatic approaches, and discusses the problem of computing confidence measures on such features. The paper then discusses the role of prosodic features and how they can be handled for automatic processing in tasks such as the detection of discourse particles, the characterization of emotions, and the classification of sentence modalities, as well as in computer assisted language learning and in expressive speech synthesis.

Crossref

INRIA a CCSD electronic archive server

Sound generation by unsteady flow ejecting from the vibrating glottis based on a distributed parameter model of the vocal cords

Author: B. H. Story and I. R. Titze
C. M. Sapienza E. T. Stathopulos a
D. G. Hanson J. Jiang, M. M. D
H. Nomura and T. Funada
Hideyuki Nomura
I. R. Titze and D. T. Talkin
J. Jiang T. O’
K. Ishizaka and J. L. Flanagan
P. Lieberman
Q. T. Tran G. S. Berke, B. R. Gerr
R. McGlone and T. Shipp
S. Granqvist S. Hertegå
T. J. Hixon D. H. Klatt and J. Mea
Tetsuo Funada
Y.-W. Shau C.-L. Wang, F.-J. Hsieh
Publication venue: 'Acoustical Society of Japan'
Publication date: 01/01/2007

Crossref

Compact speech representations for speech synthesis

Author: D. Talkin
W. Bastiaan Kleijn
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Speaker transformation using sentence HMM based alignments and detailed prosody modification

Author: D. Talkin
L.M. Arslan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Optimal Pitch Path Tracking for More Reliable Pitch Detection

Author: D. Talkin
J. Černocký
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Estimating Age-Dependent Degradation Using Nonverbal Feature Analysis of Daily Conversation

Author: D Talkin
K Sato
M Nishio
PB Nueller
Y Tanaka
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Mathematical Morphology Preprocessing to Mitigate AWGN Effects: Improving Pitch Tracking Performance in Hard Noise Conditions

Author: B.-H. Juang
D. Talkin
J. Serra
L.R. Rabiner
W. Hess
Z. Xiaoqun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

In this paper we show how nonlinear preprocessing of a very noisy speech signal, based on morphological filters, improves the performance of the robust algorithm for pitch tracking (RAPT). This result is obtained with a very simple morphological filter; more sophisticated ones could improve the results further. Mathematical morphology is widely used in image processing, where it has found many applications. Almost all of its formulations derived in the two-dimensional framework are easily adapted to the one-dimensional context.
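The abstract does not say which morphological filter was used, so the following is only a minimal sketch of the general idea, assuming a flat structuring element of odd length `k` and an opening followed by a closing. The function names (`erode`, `dilate`, `opening`, `closing`, `open_close`) and the window length are illustrative choices, not taken from the paper.

```python
def erode(x, k):
    """Flat erosion: minimum over a centered window of odd length k.
    Windows are truncated at the signal edges."""
    r = k // 2
    n = len(x)
    return [min(x[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def dilate(x, k):
    """Flat dilation: maximum over a centered window of odd length k."""
    r = k // 2
    n = len(x)
    return [max(x[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def opening(x, k):
    """Erosion then dilation: removes positive peaks narrower than k samples."""
    return dilate(erode(x, k), k)


def closing(x, k):
    """Dilation then erosion: fills negative dips narrower than k samples."""
    return erode(dilate(x, k), k)


def open_close(x, k=3):
    """Hypothetical preprocessing step: opening followed by closing."""
    return closing(opening(x, k), k)


if __name__ == "__main__":
    # A slow ramp with one positive and one negative impulse added.
    noisy = [0, 1, 2, 9, 4, 5, -3, 7, 8]
    print(open_close(noisy, k=3))
```

In a pipeline of the kind the abstract describes, a filter like `open_close` would be applied to the waveform before it is passed to the pitch tracker; the tracker itself is not shown here.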

Crossref

Secretaría de Estado de Cultura

RIUVic

Burst and Transition Cues to Voicing Perception for Spoken Initial Stops by Impaired- and Normal-Hearing Listeners

Author: David Talkin
Fred D. Brandt
James M. Pickett
Lisa D. Holden-Pitt
Sally Revoile
Publication venue: 'American Speech Language Hearing Association'

Crossref