Speech Technology - October 2008 - (Page 16)

COVER STORY

human interactions. Compounding the complexity is the need to devise viable business cases that can justify the significant investments needed in this nascent area. While some progress has been made, a vast amount of work still needs to be completed before emotional utterances are common in speech synthesis systems.

Driving Forces

A number of factors are driving the interest in adding emotion to speech system output. Proponents note that there are many potential business benefits. In today’s highly competitive, rapidly changing marketplace, companies are looking for ways to interact with customers more effectively. The emerging speech technology features can be integrated into a variety of applications: automated call centers, customer relationship management, news and email reading, self-service applications, live news, business documentation, e-learning, and even entertainment. Theoretically, the change will lead to richer, more effective interactions and result in improved customer satisfaction, increased business efficiency, and, most important, more revenue.

Another factor pushing this change is the natural progression of speech technology. “We have reached the point where we can deliver systems that are quite intelligible, so speech system design goals are now shifting to make voice systems sound more like people and less like machines,” states Andy Aaron, a research scientist at the IBM T.J. Watson Research Center.

To add such capabilities, vendors must first clear a number of hurdles. One of the biggest challenges is finding a baseline for comparing emotional and unemotional speech. “As of now, the classifications needed to identify emotional speech have not been well-known or widely accepted,” says Jim Larson, an independent consultant and VoiceXML trainer at Larson Technical Services.
In universities and vendor research and development laboratories, scientists have been trying to identify and quantify the metrics needed to correlate emotions to patterns of speech. Challenges have emerged right from the start. “Expressive speech is multidimensional: One can say ‘happy’ or ‘sad,’ but those words often have various connotations to different speakers,” IBM’s Aaron explains. Consequently, companies find themselves needing to quantify traits that lack a consensus set of features.

Another problem is that of variability. Different speakers say things in different ways at both a linguistic and vocal level. Further complicating the task, there is also considerable variability even within the speech of a single speaker. A speaker will not necessarily use the same words to say the same thing twice, and even different instances of the same word will not always be acoustically identical.

A number of factors contribute to this variability. Speakers alter their way of talking in response to a number of conditions related to their environments and their status relative to those to whom they are speaking. Such attitudinal conditions include consciously increasing intelligibility (a speaker will alter his speech for a non-native listener or due to increased background noise), familiarity (a speaker will speak more carefully to a listener with whom he is not familiar), and social status (a speaker will speak to a child differently from the way he would speak to a peer, and would speak in a different way again to a listener in a socially dominant position, such as a boss).

A Positive Outlook

Speakers’ outlooks can change as well. If a person has been successful (say, getting a big raise at work), that success will affect the way he communicates with others throughout the day. Different emotional states affect the speech production mechanism and lead to acoustical changes in the individual’s speech patterns.

Physiological factors also play a role in how we communicate. Stress can have a dramatic impact on how one talks; a person’s tone, emphasis, and even speed can change significantly if she is under a great deal of stress. Other factors, such as fatigue, illness, or the effects of drugs and alcohol, can also alter verbal exchanges between people.

While these factors are essentially independent of each speaker, they all manifest themselves to some degree in all exchanges. So researchers first have to figure out what changes each of these factors produces, and then group them together to accurately deduce how speakers evoke different emotional states.

The challenge is, for the most part, that these speech variabilities are produced unconsciously. Even when a speaking style is adopted consciously by a person, the actual vocal changes (varying pitch or pace) are often made unconsciously. Therefore, identifying the changes and developing precise descriptions of how they are produced can be hard.