Intelligent Audio Analysis

This book provides the reader with the knowledge necessary to understand the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain, going from audio data to audio features to audio recognition. It then introduces audio source separation, enhancement, and robustness. After the introductory parts, the book presents several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for that task. The book provides benchmark results and standardized test-beds for a broad range of audio analysis tasks. The main focus lies on the parallel advancement of realism in audio analysis, as today’s results are too often overly optimistic owing to idealized testing conditions. The book also aims to stimulate synergies arising from the transfer of methods, leading toward a holistic audio analysis.
Contents
5 Audio Data
6 Audio Features
Part III Intelligent Audio Analysis Applications
10 Applications in Intelligent Speech Analysis
11 Applications in Intelligent Music Analysis
12 Applications in Intelligent Sound Analysis
Part IV Conclusion
13 Discussion
14 Vision
Appendix openSMILE Standardised Feature Sets
Common terms and phrases
annotation application approach audio feature audio signal automatic ballroom dance ballroom dance style Batliner BLSTM cepstral challenge chord classification coefficients components computed ConceptNet conditional random fields Conference on Acoustics database domain emotion recognition estimation evaluation example Eyben feature extraction feature set feature space feature vector filter formants frequency function further Gaussian gender ICASSP IEEE IEEE International Conference IEEE Trans input instances Intelligent Audio Analysis International Speech Communication ISCA ISMIR labels learning algorithm linear LSTM matrix Metacritic methods metre MFCC mood Music Information Retrieval N-grams neural networks noise non-negative matrix factorization normalised output paralinguistic parameters partition pitch class Proceedings INTERSPEECH regression Rigoll robust Schuller Sect semi-supervised learning sequence shown Signal Processing speaker spectral spectrum Speech Communication Association speech recognition Steidl Table target tasks tempo traits valence vocalisations Weninger Wöllmer words Workshop