Text to Speech Synthesis: New Paradigms and Advances
Shrikanth Narayanan, Abeer Alwan
Prentice Hall Professional Technical Reference, 2005 - Computers - 257 pages
Text to speech synthesis (TTS) is a critical research and application area in the field of multimedia interfaces. Recent advances in TTS will impact a wide range of disciplines, from education, business, and entertainment applications to medical aids. Until recently, speech synthesis relied on models and rule-based approaches. While these yielded intelligible speech, the voice quality was unacceptable for widespread adoption. Fortunately, there has recently been a major technological paradigm shift in how speech synthesis is done: from rule-based to explicitly data-driven methods. Advances in computing and corpus-driven methodologies have opened exciting possibilities for research and development in this domain, yielding highly natural-sounding speech. The book focuses on recent advances and new paradigms in text to speech synthesis, contributed by leading experts from academia and industry around the world. There is no other book of this nature that comprehensively documents the recent research trends. This is important not only for researchers and students of the field but also for potential customers and other beneficiaries of the results. The book's chapters address key current topic areas in TTS: data-driven systems and unit selection; hybrid schemes (the interplay between data-driven and knowledge-based techniques); prosody models and generation; and expressive speech synthesis.
REDUCING DISCONTINUITIES AT SYNTHESIS TIME
VOICE QUALITY VARIATION IN A LONGTERM
9 other sections not shown
accent acoustic measures algorithm analysis approach articulatory synthesis cepstral computed concatenation cost concatenative speech synthesis corpus-based described diphone discontinuities distance measures emotional speech Eurospeech evaluation example expressive speech factorization scheme feature vectors Figure formant frames frequency glottal hidden Markov models HMM-based speech synthesis ICASSP ICSLP IEEE input inventory join cost function listening tests LSFs Mahalanobis distance MBROLA method MFCCs n-gram natural speech neutral optimal output palatalized consonant parameterizations parameters pause perceptual experiment phoneme pitch contour prediction Proc prosody prosody models recorded rule-based Section segments sentences sequence Signal Processing smoothing speaker speaker recognition speech corpus speech rate speech recognition speech signal speech synthesis system speech units SSML stimuli syllable synthetic speech tags techniques text to speech ToBI Tokuda TTS system unit selection unvoiced utterances values variables vocal tract voice quality difference voice quality variation vowel waveform weighted word