Speech synthesis based on Gaussian conditional random fields

Khorram, S ; Sharif University of Technology

  1. Type of Document: Article
  2. DOI: 10.1007/978-3-319-10849-0_19
  3. Abstract: Hidden Markov Model (HMM)-based synthesis (HTS) has recently been confirmed to be the most effective method for generating natural speech. However, it lacks adequate context generalization when the training data is limited. As a solution, the current study provides a new context-dependent speech modeling framework based on Gaussian Conditional Random Field (GCRF) theory. By applying this model, an innovative speech synthesis system has been developed that can be viewed as an extension of the Context-Dependent Hidden Semi-Markov Model (CD-HSMM). A novel Viterbi decoder along with a stochastic gradient ascent algorithm was applied to train the model parameters. In addition, a fast and efficient parameter generation algorithm was derived for the synthesis part. Experimental results using objective and subjective criteria show that the proposed system substantially outperforms HSMM on limited speech databases. Moreover, the Mel-cepstral distance of the spectral parameters is reduced considerably for any size of training database.
  4. Keywords: Gaussian conditional random field ; HSMM extension ; Statistical parametric speech synthesis ; Conditional random field ; Gaussians
  5. Source: Communications in Computer and Information Science ; Vol. 427, 2014, p. 183-193
  6. URL: http://link.springer.com/chapter/10.1007%2F978-3-319-10849-0_19