Automatic Speech Recognition: A Deep Learning Approach
This book provides a comprehensive overview of recent advances in automatic speech recognition, with a focus on deep learning models, including deep neural networks and many of their variants. It is the first automatic speech recognition book dedicated to the deep learning approach. In addition to a rigorous mathematical treatment of the subject, the book presents insights into, and the theoretical foundations of, a series of highly successful deep learning models.
Contents
3 Hidden Markov Models and the Variants
Part II Deep Neural Networks
4 Deep Neural Networks
5 Advanced Model Initialization Techniques
Part III Deep Neural Network-Hidden Markov Model Hybrid Systems for Automatic Speech Recognition
Part IV Representation Learning in Deep Neural Networks
9 Feature Representation Learning in Deep Neural Networks
10 Fuse Deep Neural Network and Gaussian Mixture Model Systems
11 Adaptation of Deep Neural Networks
Part V Advanced Deep Models
12 Representation Sharing and Transfer in Deep Neural Networks
13 Recurrent Neural Networks and Related Models
14 Computational Network