Incorporating Knowledge Sources into Statistical Speech Recognition
Incorporating Knowledge Sources into Statistical Speech Recognition addresses the problem of developing efficient automatic speech recognition (ASR) systems that balance the use of wide knowledge of speech variability against the need to keep training and recognition effort feasible, while improving recognition performance. The book provides an efficient general framework for incorporating additional knowledge sources into state-of-the-art statistical ASR systems. The framework can be applied flexibly to many existing ASR problems, each with its own model-based likelihood function.
Contents
Statistical Speech Recognition
Graphical Framework to Incorporate Knowledge Sources
Conclusions and Future Directions
TIMIT Acoustic-Phonetic Speech Corpus
B ATR Software Tools
Composition of Bayesian Wide-phonetic Context
References
Index
Common terms and phrases
accent acoustic model additional knowledge sources algorithm amount of training approach arg max ATRASR automatic speech recognition Bayesian network BN topology structure BTEC cepstral cepstrum clustering coarticulation conditional relationship context units corpus database decoding described in Section feature extraction feature vector files following context hidden Markov models HMM phonetic HMM/BN model hypothesis ICASSP IEEE incorporate additional knowledge Incorporating Knowledge Sources incorporating various inference junction tree junction tree algorithm knowledge-based language model likelihood LR-HMM/BN LVCSR MFCC mixture components monophone N-best nodes noise obtained output probability P-value parameter pattern recognition pentaphone HMM performance phonetic phonetic context probability function Proc pronunciation proposed HMM/BN proposed pentaphone Recognition accuracy rates rescoring second preceding sequence shown in Figure Sign test speakers speech signal SSS data test set total number training data triphone HMM baseline triphone model utterances variables wide-phonetic context