Speech Technology - June 2008 - (Page 38)

Beyond Misrecognitions and No Speech
How a larger taxonomy of errors can inspire change
BY REBECCA NOWLIN GREEN

At SpeechTEK West in San Francisco last year, a group of voice user interface designers participated in a workshop directed by James Larson of Larson Technical Services and Lizanne Kaiser of Genesys Telecommunications Laboratories. Participants developed recommendations for dealing with error messages in speech applications from four perspectives. This is the last in a series of four articles summarizing the recommendations reached by these experts.

Contributors to this article are:
• REBECCA NOWLIN GREEN, Nuance Communications;
• LIZANNE KAISER, Genesys Telecommunications Laboratories;
• EDUARDO OLVERA, Nuance Communications; and
• GREG SIMSAR, Syntellect.

In the previous three articles in this series, you’ve read some ideas for how to move beyond the usual “I didn’t get that” after a misrecognition error, or “I didn’t hear anything” after a failure to detect speech. However, in all of those cases, some pretty tight constraints still existed. It has generally been assumed that designers will know only how much speech was detected by the recognizer, what was recognized, and what confidence score the recognizer assigned to the result. And while it’s important to leverage a little more context, such as how many other errors have happened or what other recognition results might appear in the N-best list, we’re frequently left wishing for even more information.

To illustrate how a lack of information can result in the same type of design we’ve been seeing since before the turn of the century, consider this scenario in which a mother at home is trying to give her account number to a speech recognition application:

System: What’s your account number?
Caller: Excuse me. (Cough, cough, breath).
System: I’m sorry, I didn’t get that. Please say the number again.
Caller: OK, my account number is (Side speech: “Hey Mom! Where did you hide the peanut butter?” “Can’t you see I’m on the phone?”)
System: Hmmm, I still didn’t get it. Say your nine-digit account number once more.
Caller: One, eight, five (Doorbell, sigh). Do you mind if I get the door?
System: Sorry I’m having so much trouble understanding. You can try calling back later, or hang on for the next available agent…

Though the caller did nothing wrong, three errors occurred in the dialogue. In each exchange, the recognizer detected some speech but wasn’t confident that what it heard was a complete account number. We know this because the error messages said things like “I didn’t get that” instead of “I didn’t hear anything.” However, just because the system asked for an account number doesn’t necessarily mean the caller gave a complete one. This design (like many) wrongly assumes that she did. That’s why it keeps asking the caller to say her account number “again.”

Knowing more about what went wrong helps to make it right. Imagine if the same caller had been speaking to an agent instead of an automated system. None of the errors would have caused any confusion at all, as in this scenario:

Agent: What’s your account number?