Date of Award

2012

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Engineering

First Advisor

Graves, Corey A. Dr.

Abstract

With the exponential growth of social media in today's society, the amount of data disseminated on these sites has grown as well. Despite the volume of communication that takes place on social media sites such as Twitter and Facebook, there are few ways to express one's emotions on them aside from emoticons and computer language (e.g., LOL, SMH). Studies have shown that there is a correlation between what happens on social media and one's emotions [3]. Studies also show that users tend to express their opinions more via social media than in person [3]. While text-to-speech (TTS) applications can convert text to speech, the speech they create lacks the emotion the user would have conveyed had they said it themselves. PSEE VOICE (Pervasive System for Educational Enhancement) is designed as an add-on system to the UMSEE (Ubiquitous Mobile System for Educational Enhancement) project designed by Dr. Corey Graves and Brandon Judd. Using Twitter, Cinch (an audio equivalent of Twitter), and eSpeak, a TTS application, the PSEE VOICE system can register a user’s voice in the database and convert text sent in via Twitter. The purpose of the system is to allow users to send text messages that sound more human-like. This thesis addresses pitch and word transition and how they affect the clarity of a speech signal. PSEE VOICE uses Weighted Frequency Warping and a Gaussian Mixture Model (GMM) based transformation function. The system depends on the training data used for registration, the transformation used for voice conversion, and the order of the GMMs used for training. The results show that having more training sets, using a 16th-order GMM for training, and using the GMM-based transformation yield the clearest results.
