Dynamic Speech Models: Theory, Algorithms, and Applications

Morgan & Claypool Publishers, 2006 - Computers - 105 pages
Speech dynamics refers to the temporal characteristics of all stages of the human speech communication process. This process starts with the formation of a linguistic message in the speaker's brain and ends with the arrival of that message in the listener's brain. Given the intricacy of the dynamic speech process and its fundamental importance in human communication, this monograph aims to provide comprehensive material on mathematical models of speech dynamics and to address the following questions: How do we make sense of the complex speech process in terms of its functional role in speech communication? How do we quantify the special role of speech timing? How do the dynamics relate to the variability of speech, which has often been said to seriously hamper automatic speech recognition? How do we put the dynamic process of speech into a quantitative form that enables detailed analysis? And finally, how can we incorporate knowledge of speech dynamics into computerized speech analysis and recognition algorithms? Answering all of these questions requires building and applying computational models of the dynamic speech process.
 


Contents

Introduction
1.2 WHAT ARE MODELS OF SPEECH DYNAMICS?
1.3 WHY MODELING SPEECH DYNAMICS?
1.4 OUTLINE OF THE BOOK

A General Modeling and Computational Framework
2.2 MODEL DESIGN PHILOSOPHY AND OVERVIEW
2.3 MODEL COMPONENTS AND THE COMPUTATIONAL FRAMEWORK
2.3.2 Segmental Target Model
2.3.3 Articulatory Dynamic Model
2.3.4 Functional Nonlinear Model for Articulatory-to-Acoustic Mapping
2.3.5 Weakly Nonlinear Model for Acoustic Distortion
2.3.6 Piecewise Linearized Approximation for Articulatory-to-Acoustic Mapping
2.4 SUMMARY

Modeling From Acoustic Dynamics to Hidden Dynamics
3.2 STATISTICAL MODELS FOR ACOUSTIC SPEECH DYNAMICS
3.2.1 Nonstationary-State HMMs
3.2.2 Multiregion Recursive Models
3.3 STATISTICAL MODELS FOR HIDDEN SPEECH DYNAMICS
3.3.1 Multiregion Nonlinear Dynamic System Models
3.3.2 Hidden Trajectory Models

Models with Discrete-Valued Hidden Speech Dynamics
4.1.1 Probabilistic Formulation of the Basic Model
Overview
4.1.4 A Generalized Forward-Backward Algorithm
The M-Step
4.1.6 Decoding of Discrete States by Dynamic Programming
4.2 EXTENSION OF THE BASIC MODEL
4.2.2 Extension from Linear to Nonlinear Mapping
4.2.3 An Analytical Form of the Nonlinear Mapping Function
4.2.4 E-Step for Parameter Estimation
4.2.5 M-Step for Parameter Estimation
4.2.6 Decoding of Discrete States by Dynamic Programming
4.3.2 Experimental Results
4.4 SUMMARY

Models with Continuous-Valued Hidden Speech Trajectories
5.1.1 Generating Stochastic Hidden Vocal Tract Resonance Trajectories
5.1.2 Generating Acoustic Observation Data
5.1.4 Computing Acoustic Likelihood
5.2 UNDERSTANDING MODEL BEHAVIOR BY COMPUTER SIMULATION
5.2.2 Effects of Speaking Rate on Reduction
5.2.3 Comparisons with Formant Measurement Data
5.2.4 Model Prediction of Vocal Tract Resonance Trajectories for Real Speech Utterances
5.2.5 Simulation Results on Model Prediction for Cepstral Trajectories
5.3 PARAMETER ESTIMATION
5.3.2 Vocal Tract Resonance Targets' Distributional Parameters
5.4 APPLICATION TO PHONETIC RECOGNITION
5.4.2 Experimental Results
5.5 SUMMARY


Popular passages

Page xi - Acknowledgments: This book would not have been possible without the help and prayers of many people.
Page 97 - How do humans process and recognize speech? IEEE Trans. Speech Audio Process.
Page 103 - L. Deng, L. Lee, H. Attias, and A. Acero. "A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances,
Page 97 - H. Bourlard, H. Hermansky, and N. Morgan. "Towards increasing speech recognition error rates,
Page 97 - No. (#115-9732388), and was carried out at the 1998 Workshop on Language Engineering, Center for Language and Speech Processing, Johns Hopkins University.
Page 103 - Pitermann, Michel. 2000. Effect of speaking rate and contrastive stress on formant dynamics and vowel perception.
Page 97 - An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition...

About the author (2006)

Microsoft Research
